PEPTIDE LIBRARY AND SCREENING METHOD

CROSS-REFERENCE TO RELATED APPLICATIONS 
The present application is a continuation-in-part of
USSN 08/290,641, filed 3/15/94, which is a continuation of
USSN 07/953,321, now US 5,338,665, filed October 15, 1992,
which is a continuation-in-part of USSN 07/773,233, filed
October 16, 1991, now US 5,270,170, and is related to copending
USSN 07/517,659, filed May 1, 1990, and to copending USSN
07/541,103, filed June 20, 1990, which is a
continuation-in-part of copending USSN 07/713,577, filed June
20, 1991, each of which is incorporated by reference in its
entirety for all purposes.

FIELD OF THE INVENTION
The present invention relates generally to methods
for selecting peptide ligands to receptor molecules of
interest and, more particularly, to methods for generating and
screening large peptide libraries for peptides with desired
binding characteristics.

BACKGROUND OF THE INVENTION 
The isolation of ligands that bind biological
receptors is fundamental to understanding signal transduction
and to discovering new therapeutics. The ability to
synthesize DNA chemically has made possible the construction
of extremely large collections of nucleic acid and peptide
sequences as potential ligands. Recently developed methods
allow efficient screening of libraries for desired binding
activities (see Pluckthun & Ge, Angew. Chem. Int. Ed. Engl.
30, 296-298 (1991)). For example, RNA molecules with the
ability to bind a particular protein (see Tuerk & Gold,
Science 249, 505-510 (1990)) or a dye (see Ellington & Szostak,
Nature 346, 818-822 (1990)) have been selected by alternate
rounds of affinity selection and PCR amplification. A similar
technique was used to determine the DNA sequences that bound a
human transcription factor (see Thiesen & Bach, Nucl. Acids
Res. 18, 3203-3209 (1990)).

Application of efficient screening techniques to
peptides requires the establishment of a physical or logical
connection between each peptide and the nucleic acid that
encodes the peptide. After rounds of affinity enrichment,
such a connection allows identification, usually by
amplification and sequencing, of the genetic material encoding
interesting peptides. Several phage based systems for
screening proteins and polypeptides have been described. The
fusion phage approach of Parmley and Smith, 1988, Gene 73,
305-318, can be used to screen proteins. Others have
described phage based systems in which the peptide is fused to
the pIII coat protein of filamentous phage (see Scott & Smith,
Science 249, 386-390 (1990); Devlin et al., Science 249,
404-406 (1990); and Cwirla et al., Proc. Natl. Acad. Sci. USA
87, 6378-6382 (1990); each of which is incorporated herein by
reference).

In these latter publications, the authors describe
expression of a peptide at the amino terminus of or internal
to the pIII protein. The connection between peptide and the
genetic material that encodes the peptide is established,
because the fusion protein is part of the capsid enclosing the
phage genomic DNA. Phage encoding peptide ligands for
receptors of interest can be isolated from libraries of
greater than 10⁸ peptides after several rounds of affinity
enrichment followed by phage growth. Other non-phage based
systems that could be suggested for the construction of
peptide libraries include direct screening of nascent peptides
on polysomes (see Tuerk & Gold, supra) and display of peptides
directly on the surface of E. coli. As in the filamentous
phage system, all of these methods rely on a physical
association of the peptide with the nucleic acid that encodes
the peptide.

There remains a need for methods of constructing
peptide libraries in addition to the methods described above.
For instance, the above methods do not provide random peptides
with a free carboxy terminus, yet such peptides would add
diversity to the peptide structures now available for receptor
binding. In addition, prior art methods for constructing
random peptide libraries cannot tolerate stop codons in the
degenerate region coding for the random peptide, yet stop
codons occur with frequency in degenerate oligonucleotides.
Prior art methods involving phage fusions require that the
fusion peptide be exported to the periplasm and so are limited
to fusion proteins that are compatible with the protein export
apparatus and the formation of an intact phage coat.

The present invention provides random peptide 
libraries and methods for generating and screening these 
libraries with significant advantages over the prior art 
methods.



SUMMARY OF THE INVENTION 
The present invention provides random peptide 
libraries and methods for generating and screening those 
libraries to identify peptides that bind to receptor molecules 
of interest. The peptides can be used for therapeutic, 
diagnostic, and related purposes, e.g., to bind the receptor 
or an analogue of the receptor and so inhibit or promote the
activity of the receptor. 

The peptide library of the invention is constructed
so that the peptide is expressed as a fusion product; the
peptide is fused to a DNA binding protein. The peptide
library is constructed so that the DNA binding protein can
bind to the recombinant DNA expression vector that encodes the
fusion product that contains the peptide of interest. The
method of generating the peptide library of the invention
comprises the steps of (a) constructing a recombinant DNA
vector that encodes a DNA binding protein and contains a
binding site for the DNA binding protein; (b) inserting into
the coding sequence of the DNA binding protein in the vector
of step (a) a coding sequence for a peptide such that the
resulting vector encodes a fusion protein composed of the DNA
binding protein and the peptide; (c) transforming a host cell
with the vector of step (b); and (d) culturing the host cell
transformed in step (c) under conditions suitable for
expression of the fusion protein.

The screening method of the invention comprises the 
steps of (a) lysing the cells transformed with the peptide 
library under conditions such that the fusion protein remains 
bound to the vector that encodes the fusion protein; 
(b) contacting the fusion proteins of the peptide library with 
a receptor under conditions conducive to specific peptide-
receptor binding; and (c) isolating the vector that encodes a
peptide that binds to said receptor. By repetition of the 
affinity selection process one or more times, the plasmids 
encoding the peptides of interest can be enriched. By 
increased stringency of the selection, peptides of 
increasingly higher affinity can be identified. 

The present invention also relates to recombinant 
DNA vectors useful for constructing the random peptide 
library, the random peptide library, host cells transformed 
with the recombinant vectors of the library, and fusion 
proteins expressed by these host cells. 

BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 shows a recombinant vector of a random
peptide library of the invention. In this embodiment of the
invention, the DNA binding protein is the lacI gene product,
the fusion protein forms a tetramer, and the tetramer
interacts with the vector and immobilized receptor, as shown
in the Figure. The library plasmid carries the lacI gene with
random coding sequence fused to the 3' end of the coding
sequence of the gene, as well as two lacO sequences. The lac
repressor-peptide fusions produced by the hybrid genes bind to
the lacO sites on the same plasmid that encodes them. After
lysis of cells containing the random library, those
plasmid-repressor-peptide complexes that specifically bind a
chosen receptor are enriched by avidity panning against the
immobilized receptor. Transformation of E. coli with
recovered plasmids allows additional rounds of panning or
sequencing of isolated clones.
Fig. 2 [SEQ ID NOS:1-5] shows a partial restriction
site, DNA sequence, and function map of plasmid pMC5.
Hybridization of oligonucleotide ON-332 to oligonucleotides
ON-369 and ON-370 produces a fragment with cohesive ends
compatible with SfiI, HindIII digested plasmid pMC5. The
ligation product adds sequence coding for twelve random amino
acids to the end of lacI through a six codon linker. The
library plasmid also contains: the rrnB transcriptional
terminator, the bla gene to permit selection on ampicillin,
the M13 phage intragenic region to permit rescue of
single-stranded DNA, a plasmid replication origin (ori), two
lacOs sequences, and the araC gene to permit positive and
negative regulation of the araB promoter that drives
expression of the lacI fusion gene.

Fig. 3 [SEQ ID NOS:7-64] shows sequences isolated by
panning with the D32.39 antibody. Each sequence is listed
with a clone number, the panning round in which the clone was
isolated, and the result of the ELISA with D32.39 antibody.
The sequences are aligned to show the D32.39 epitope that they
share (box).

Fig. 4 [SEQ ID NOS:33-31] shows the linker sequences
from vectors pJS141 and pJS142.

Fig. 5 [SEQ ID NO:55]: Arrangement of lac
headpieces, linkers and displayed peptide in headpiece dimer.

Fig. 6 [SEQ ID NOS:87 and 92-122]: Sequences of
headpiece dimer proteins. (a) Sequence of headpiece domains
and adjoining linkers as constructed for the headpiece dimer
linker library. (b) Protein sequence of linker library clones
isolated after four rounds of panning selection, showing
linker sequences and residue changes from the original
headpiece protein sequence where indicated. Unchanged
residues are marked with a dot; residue deletions are
noted with a hyphen. (c) Protein sequences of clones
isolated after mutagenesis and four rounds of panning
selection. Unsequenced positions are noted with question
marks.

Fig. 7 [SEQ ID NOS:123-125]: Construction of
headpiece dimer libraries in vector pCMG14. (a) Restriction
map and positions of genes. The library plasmid includes: the
rrnB transcriptional terminator, the bla gene to permit
selection on ampicillin, the M13 phage intragenic region (M13
IG) to permit rescue of single-stranded DNA, a plasmid
replication origin (ori), one lacOs sequence, and the araC
gene to permit positive and negative regulation of the araB
promoter driving expression of the headpiece dimer fusion
gene. (b) Sequence of the cloning region at the 3' end of the
headpiece dimer gene, including the SfiI and EagI sites used
during library construction. (c) Ligation of annealed
ON-1679, ON-829, and ON-830 to SfiI sites of pCMG14 to produce
a library. Single spaces in the sequence indicate sites of
ligation.

Fig. 8 [SEQ ID NOS:127-162]: Sequences of D32.39
MAb-specific peptides isolated from random libraries after
four rounds of panning. Peptides derived from the headpiece
dimer library are preceded by "HpD"; sequences from the lacI
peptides-on-plasmids library are preceded by "lacI". The
isolate numbers correspond to those in Fig. 9. The boxed
portion represents the alignment of peptide sequence with the
known D32.39 monoclonal antibody epitope RQFKVVT [SEQ ID
NO:55].

Fig. 9: MBP ELISA using peptides isolated from
headpiece dimer and lacI peptides-on-plasmids random
libraries. High and low affinity control peptides are
expressed by plasmids pCMG39 and pCMG33, respectively. pELM3
negative control encodes MBP with an irrelevant fusion
peptide. Random library clones are numbered as in Fig. 8.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS 
For purposes of clarity and a complete understanding
of the invention, the following terms are defined.

"DNA Binding Protein" refers to a protein that
specifically interacts with deoxyribonucleotide strands. A
sequence-specific DNA binding protein binds to a specific
sequence or family of specific sequences showing a high degree
of sequence identity with each other (e.g., at least about 80%
sequence identity) with at least 100-fold greater affinity
than to unrelated sequences. The dissociation constant of a
sequence-specific DNA binding protein to its specific
sequence(s) is usually less than about 100 nM, and may be as
low as 10 nM, 1 nM, 1 pM or 1 fM. A nonsequence-specific DNA
binding protein binds to a plurality of unrelated DNA
sequences with a dissociation constant that varies by less
than 100-fold, usually less than tenfold, to the different
sequences. The dissociation constant of a nonsequence-
specific DNA binding protein to the plurality of sequences is
usually less than about 1 mM. In embodiments of the invention
in which RNA vectors are used, DNA binding protein can also
refer to an RNA binding protein.

"Epitope" refers to that portion of an antigen that
interacts with an antibody.

"Host Cell" refers to a eukaryotic or procaryotic
cell or group of cells that can be or has been transformed by
a recombinant DNA vector. For purposes of the present
invention, a host cell is typically a bacterium, such as an
E. coli K12 cell or an E. coli B cell.

"Ligand" refers to a molecule, such as a random
peptide, that is recognized by a particular receptor.

"Ligand Fragment" refers to a portion of a gene
encoding a ligand and to the portion of the ligand encoded by
that gene fragment.

"Ligand Fragment Library" refers not only to a set
of recombinant DNA vectors that encodes a set of ligand
fragments, but also to the set of ligand fragments encoded by
those vectors, as well as the fusion proteins containing those
ligand fragments.

"Linker" or "spacer" refers to a molecule or group
of molecules that connects two molecules, such as a DNA
binding protein and a random peptide, and serves to place the
two molecules in a preferred configuration, e.g., so that the
random peptide can bind to a receptor with minimal steric
hindrance from the DNA binding protein.

"Peptide" or "polypeptide" refers to a polymer in
which the monomers are alpha amino acids joined together
through amide bonds. Peptides are two or, often, more amino
acid monomers long. Standard abbreviations for amino acids
are used herein (see Stryer, 1988, Biochemistry, Third Ed.,
incorporated herein by reference).

"Random Peptide" refers to an oligomer composed of
two or more amino acid monomers and constructed by a
stochastic or random process. A random peptide can include
framework or scaffolding motifs, as described below.

"Random Peptide Library" refers not only to a set of
recombinant DNA vectors that encodes a set of random peptides,
but also to the set of random peptides encoded by those
vectors, as well as the fusion proteins containing those
random peptides.

"Receptor" refers to a molecule that has an affinity
for a given ligand. Receptors can be naturally occurring or
synthetic molecules. Receptors can be employed in an
unaltered state or as aggregates with other species.
Receptors can be attached, covalently or noncovalently, to a
binding member, either directly or via a specific binding
substance. Examples of receptors include, but are not limited
to, antibodies, including monoclonal antibodies and antisera
reactive with specific antigenic determinants (such as on
viruses, cells, or other materials), cell membrane receptors,
enzymes, and hormone receptors.

"Recombinant DNA Vector" refers to a DNA or RNA
molecule that encodes a useful function and can be used to
transform a host cell. For purposes of the present invention,
a recombinant DNA vector typically is a phage or plasmid and
can be extrachromosomally maintained in a host cell or
controllably integrated into and excised from a host cell
chromosome.

The present invention provides random peptide
libraries and methods for generating and screening those
libraries to identify either peptides that bind to receptor
molecules of interest or gene products that modify peptides or
RNA in a desired fashion. The peptides are produced from
libraries of random peptide expression vectors that encode
peptides attached to a DNA binding protein. A method of
affinity enrichment allows a very large library of peptides to
be screened and the vector carrying the desired peptide(s) to
be selected. The nucleic acid can then be isolated from the
vector and sequenced to deduce the amino acid sequence of the
desired peptide. Using these methods, one can identify a
peptide as having a desired binding affinity for a molecule.
The peptide can then be synthesized in bulk by conventional
means.

By identifying the peptide de novo, one need not
know the sequence or structure of the receptor molecule or the
sequence or structure of the natural binding partner of the
receptor. Indeed, for many "receptor" molecules a binding
partner has not yet been identified. A significant advantage
of the present invention is that no prior information
regarding an expected ligand structure is required to isolate
peptide ligands of interest. The peptide identified will have
biological activity, which is meant to include at least
specific binding affinity for a selected receptor molecule
and, in some instances, will further include the ability to
block the binding of other compounds, to stimulate or inhibit
metabolic pathways, to act as a signal or messenger, to
stimulate or inhibit cellular activity, and the like.

The number of possible receptor molecules for which
peptide ligands may be identified by means of the present
invention is virtually unlimited. For example, the receptor
molecule may be an antibody (or a binding portion thereof).
The antigen to which the antibody binds may be known and
perhaps even sequenced, in which case the invention may be
used to map epitopes of the antigen. If the antigen is
unknown, such as with certain autoimmune diseases, for
example, sera, fluids, tissue, or cells from patients with the
disease can be used in the present screening method to
identify peptides, and consequently the antigen, that elicits
the autoimmune response. One can also use the present
screening method to tailor a peptide to a particular purpose.
Once a peptide has been identified, that peptide can serve as,
or provide the basis for, the development of a vaccine, a
therapeutic agent, a diagnostic reagent, etc.
The present invention can be used to identify
peptide ligands for a wide variety of receptors in addition to
antibodies. These ligands include, by way of example and not
limitation, growth factors, hormones, enzyme substrates,
interferons, interleukins, intracellular and intercellular
messengers, lectins, cellular adhesion molecules, and the
like. Peptide ligands can also be identified by the present
invention for molecules that are not peptides or proteins,
e.g., carbohydrates, non-protein organic compounds, metals,
etc. Thus, although antibodies are widely available and
conveniently manipulated, antibodies are merely representative
of receptor molecules for which peptide ligands can be
identified by means of the present invention.

The peptide library is constructed so that the DNA
binding protein-random peptide fusion product can bind to the
recombinant DNA expression vector that encodes the fusion
product that contains the peptide of interest. The method of
generating the peptide library comprises the steps of
(a) constructing a recombinant DNA vector that encodes a DNA
binding protein and contains binding sites for the DNA binding
protein; (b) inserting into the coding sequence of the DNA
binding protein in a multiplicity of vectors of step (a)
coding sequences for random peptides such that the resulting
vectors encode different fusion proteins, each of which is
composed of the DNA binding protein and a random peptide;
(c) transforming host cells with the vectors of step (b); and
(d) culturing the host cells transformed in step (c) under
conditions suitable for expression of the fusion proteins.
Typically, a random peptide library will contain at least 10⁶
to 10⁸ different members, although library sizes of 10⁸ to
10¹³ can be achieved.
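The relation between library size and the number of possible peptides can be illustrated with a short calculation. The following is a minimal sketch, not part of the disclosed method: it assumes, for simplicity, that every peptide of a given length is equally likely (which a real degenerate oligonucleotide only approximates), and the library size used is a hypothetical value chosen for illustration. The function names are likewise illustrative.

```python
import math


def sequence_space(n_residues, alphabet=20):
    """Number of distinct peptides of the given length (20 amino acids assumed)."""
    return alphabet ** n_residues


def expected_coverage(library_size, n_residues):
    """Expected fraction of all possible peptides present at least once.

    Assumes each clone is an independent, uniform draw from the sequence
    space (a simplifying assumption), so coverage = 1 - exp(-N / space).
    """
    space = sequence_space(n_residues)
    return -math.expm1(-library_size / space)


if __name__ == "__main__":
    hypothetical_library = 10 ** 8  # assumed size, for illustration only
    for length in (5, 6, 8, 12):
        print(length, sequence_space(length),
              expected_coverage(hypothetical_library, length))
```

Under these assumptions a library of this size would sample most hexapeptides but only a very small fraction of dodecapeptides, which is one reason library size matters for longer random regions.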

The peptide library produced by this method is
especially useful in screening for ligands that bind to a
receptor of interest. This screening method comprises the
steps of (a) lysing the cells transformed with the peptide
library under conditions such that the fusion protein remains
bound to the vector that encodes the fusion protein;
(b) contacting the fusion proteins of the peptide library with
a receptor under conditions conducive to specific peptide-
receptor binding; and (c) isolating the vector that encodes a
peptide that binds to said receptor. By repetition of the
affinity selection process one or more times, the vectors that
encode the peptides of interest may be enriched. By increased
stringency of the selection, peptides of increasingly higher
affinity can be identified. If the presence of cytoplasmic or
periplasmic proteins interferes with binding of fusion protein
to receptor, then partial purification of fusion protein-
plasmid complexes by gel filtration, affinity, or other
purification methods can be used to prevent such interference.
For instance, purification of the cell lysate on a column
(such as the Sephacryl S-400 HR column) that removes small
proteins and other molecules may be useful.
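The effect of repeating the affinity selection can be sketched with a simple two-population model. This is an illustration only and not the disclosed procedure: the starting fraction of binding clones and the per-round recovery of binders and of non-binding background are hypothetical parameters, and the function name is illustrative.

```python
def enrich(binder_fraction, binder_recovery, background_recovery, rounds):
    """Track the fraction of binding clones over successive panning rounds.

    binder_recovery and background_recovery are the (assumed) fractions of
    binding and non-binding complexes recovered in each round.
    """
    history = [binder_fraction]
    fraction = binder_fraction
    for _ in range(rounds):
        kept_binders = fraction * binder_recovery
        kept_background = (1.0 - fraction) * background_recovery
        fraction = kept_binders / (kept_binders + kept_background)
        history.append(fraction)
    return history


if __name__ == "__main__":
    # Hypothetical numbers: one binder per million clones, 10% of binders
    # and 0.01% of background recovered per round.
    for round_number, fraction in enumerate(enrich(1e-6, 0.1, 1e-4, 4)):
        print(round_number, fraction)
```

With these assumed values each round improves the odds of a binder by the ratio of the two recoveries, so a clone that starts as a rare member of the library can dominate the population after a few rounds.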

The recombinant vectors of the random peptide
library are constructed so that the random peptide is
expressed as a fusion product; the peptide is fused to a DNA
binding protein. A DNA binding protein of the invention must
exhibit high avidity binding to DNA and have a region that can
accept insertions of amino acids without interfering with the
DNA binding activity. The half-life of a DNA binding
protein-DNA complex produced by practice of the present method
must be long enough to allow screening to occur. Typically,
the half-life will be at least 15 min and often between one to
four hours or longer.

Suitable DNA binding proteins for purposes of the
present invention include proteins selected from a large group
of known DNA binding proteins including transcriptional
regulators and proteins that serve structural functions on
DNA. Examples include: proteins that recognize DNA by virtue
of a helix-turn-helix motif, such as the phage 434 repressor,
the lambda phage cI and cro repressors, and the E. coli CAP
protein from bacteria and proteins from eukaryotic cells that
contain a homeobox helix-turn-helix motif; proteins containing
the helix-loop-helix structure, such as myc and related
proteins; proteins with leucine zippers and DNA binding basic
domains such as fos and jun; proteins with 'POU' domains such
as the Drosophila paired protein; proteins with domains whose
structures depend on metal ion chelation such as Cys₂His₂ zinc
fingers found in TFIIIA, Zn₂(Cys)₆ clusters such as those
found in yeast Gal4, the Cys₃His box found in retroviral
nucleocapsid proteins, and the Zn₂(Cys)₈ clusters found in
nuclear hormone receptor-type proteins; the phage P22 Arc and
Mnt repressors (see Knight et al., J. Biol. Chem. 264,
3639-3642 (1989) and Bowie & Sauer, J. Biol. Chem. 264,
7595-7602 (1989), each of which is incorporated herein by
reference); and others. Proteins that bind DNA in a non-
sequence-specific manner are also used, for example, histones,
protamines, and HMG type proteins. In addition, proteins
could be used that bind to DNA indirectly, by virtue of
binding another protein bound to DNA. Examples of these
include yeast Gal80 and adenovirus E1A protein. Phage coat
proteins, which associate with DNA by encapsidation of the DNA
in a phage coat, and are used in the phage display methods of
screening peptides discussed in the Background are typically
not employed in the present invention.

Some DNA binding proteins can be selected from the
above list by virtue of their possession of a dissociation
half-life of at least fifteen min. Data on DNA half-lives are
available for several DNA binding proteins. For example, the
arc repressor of phage P22 has a dissociation half-life of
80 min (see, e.g., Knight et al., J. Biol. Chem. 264,
3639-3642 (1989), Vershon et al., J. Mol. Biol. 195, 323-331
(1987)). For other DNA binding proteins, dissociation half-
life can be determined by standard biochemical procedures
(see, e.g., Bourgeois, Methods Enzymol. 21D, 491-500 (1971)
(filter binding assay), Knight & Sauer, J. Biol. Chem. 264,
13705-13710 (1989) (DNA modification protection assay)).

The lac repressor is one of the many DNA binding
proteins that can be used in the construction of the libraries
of the invention. The lac repressor, a 37 kDa protein, is the
product of the E. coli lacI gene and negatively controls
transcription of the lacZYA operon by binding to a specific
DNA sequence called lacO. Structure-function relationships in
the lac repressor have been studied extensively through the
construction of thousands of amino acid substitution variants
of the protein (see Gordon et al., J. Mol. Biol. 200, 239-251
(1988), and Kleina & Miller, J. Mol. Biol. 212, 295-318
(1990)). The repressor exists as a tetramer in its native
form with two high affinity DNA binding domains formed by the
amino termini of the subunits (see Beyreuther, The Operon
(Miller and Reznikoff, eds., Cold Spring Harbor Laboratory,
1980), pp. 123-154). The two DNA binding sites exhibit strong
cooperativity of binding to DNA molecules with two lacO
sequences. A single tetramer can bind to suitably spaced
sites on a plasmid, forming a loop of DNA between the two
sites, and the resulting complex is stable for days (see Besse
et al., EMBO J. 5, 1377-1381 (1986); Flashner & Gralla, Proc.
Natl. Acad. Sci. USA 85, 8968-8972 (1988); Hsieh et al., J.
Biol. Chem. 262, 14583-14591 (1987); Kramer et al., EMBO J.
6, 1481-1491 (1987); Mossing & Record, Science 233, 889-892;
and Whitson et al., J. Biol. Chem. 262, 14592-14599 (1987)).

The carboxy terminal domains of the lac repressor
form the dimer and tetramer contacts, but significantly,
fusions of proteins as large as β-galactosidase can be made to
the carboxy terminus without eliminating the DNA binding
activity of the repressor (see Muller-Hill and Kania, Nature
249, 561-563 (1974); and Brake et al., Proc. Natl. Acad. Sci.
USA 75, 4824-4827 (1978)). The lac repressor fusion proteins
of the present invention include not only carboxy terminus
fusions but also amino terminus fusions and peptide insertions
in the lac repressor. Substitutions of other sequences,
including eukaryotic nuclear localisation signals,
transcriptional activation domains, and nuclease domains, have
been made at both the amino and carboxy termini of the lac
repressor without serious disruption of specific DNA binding
(see Hu and Davidson, Gene 99, 141-150 (1991); Labow et al.,
Mol. Cell. Biol. 10, 3343-3356 (1990); and Panayotatos et al.,
J. Biol. Chem. 264, 15066-15069 (1989)).

The binding of the lac repressor to a single
wild-type lacO is both tight and rapid, with a dissociation
constant of 10⁻¹³ M, an association rate constant of 7 x 10⁹
M⁻¹s⁻¹, and a half-life for the lac repressor-lacO complex of
about 30 min (see Barkley and Bourgeois, 1980, The Operon
(Miller and Reznikoff, eds., Cold Spring Harbor Laboratory),
pp. 177-220). The high stability of the lac repressor-DNA
complex has permitted its use in methods for identifying DNA
binding proteins (see Levens and Howley, Mol. Cell. Biol. 5,
2307-2315 (1985)), for quantifying PCR-amplified DNA (see
Lundeberg et al., Bio/Tech. 10, 68-75 (1991)), and for
cleavage of the E. coli and yeast genomes at a single site
(see Koob and Szybalski, Science 250, 271-273 (1990)). This
stability is important for purposes of the present invention,
because, for the affinity selection or "panning" step of the
screening process to succeed, the connection between the
fusion protein and the plasmid that encodes the fusion protein
must remain intact for at least a portion of the complexes
throughout the panning step.
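The requirement that a portion of the complexes survive the panning step can be made concrete with a simple decay estimate. The sketch below assumes first-order dissociation with no rebinding, an idealization that the text does not state; the panning durations are hypothetical, and the 30 min half-life is the figure quoted above for a single wild-type lacO site.

```python
def fraction_intact(elapsed_min, half_life_min):
    """Fraction of protein-DNA complexes still intact after elapsed_min.

    Assumes simple exponential (first-order) decay and no rebinding,
    which is an idealization for illustration only.
    """
    return 0.5 ** (elapsed_min / half_life_min)


if __name__ == "__main__":
    half_life = 30.0  # min, single wild-type lacO site as quoted in the text
    for panning_time in (15.0, 30.0, 60.0, 120.0):  # hypothetical durations
        print(panning_time, fraction_intact(panning_time, half_life))
```

Under this idealization one quarter of the complexes remain after a panning step lasting two half-lives, which illustrates why complexes with longer half-lives are preferred.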

In fact, for purposes of the present invention, a
longer half-life is preferred. A variety of techniques can be
used to increase the stability of the DNA binding protein-DNA
complex. These techniques include altering the amino acid
sequence of the DNA binding protein, altering the DNA sequence
of the DNA binding site, increasing the number of DNA binding
sites on the vector, adding compounds that increase the
stability of the complex (such as lactose or ONPF for the lac
system), and various combinations of each of these techniques.

An illustrative random peptide library cloning
vector of the invention, plasmid pMC5, demonstrates some of
these techniques. Plasmid pMC5 has two lacO sequences to take
advantage of the strong cooperative interaction between a lac
repressor tetramer and two lac repressor binding sites, and
each of these sequences is the symmetric variant of the lacO
sequence, called lacOs or lacOid, which has about ten fold
higher affinity for repressor than the wild-type sequence (see
Sadler et al., Proc. Natl. Acad. Sci. USA 80, 6785-6789
(1983), and Simons et al., Proc. Natl. Acad. Sci. USA 81,
1624-1628 (1984)). Other "tight-binding" lac repressors and
coding sequences for those repressors that can be used for
purposes of the present invention are described in Maurizot
and Grebert, FEBS Letters 229(1), 105-108 (1988), incorporated
herein by reference. See also Lehming et al., EMBO J. 6(10),
3145-3153 (1987).

Plasmid pMC5 is shown in Figs. 1 and 2, and details
of the construction of the plasmid are in Example 1, below.
This library plasmid contains two major functional elements in
a vector that permits replication and selection in E. coli.
The lacI gene is expressed under the control of the araB
promoter and has a series of restriction enzyme sites at the
3' end of the gene. Synthetic oligonucleotides cloned into
these sites fuse the lac repressor protein coding sequence to
additional random peptide coding sequence.

Once a vector such as pMC5 is constructed, one need
only clone peptide coding sequences in frame with the DNA
binding protein coding sequences to obtain a random peptide
library of the invention. Thus, the random peptide library of
the invention is constructed by cloning an oligonucleotide
that contains the random peptide coding sequence (and any
spacers, framework determinants, etc., as discussed below)
into a selected cloning site of a vector that encodes a DNA
binding protein and binding sites for that protein.

Using known recombinant DNA techniques (see
generally, Sambrook et al., Molecular Cloning, A Laboratory
Manual, 2d ed., Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, N.Y., 1989, incorporated herein by reference),
one can synthesize an oligonucleotide that, inter alia,
removes unwanted restriction sites and adds desired ones,
reconstructs the correct portions of any sequences that have
been removed, inserts the spacer, conserved, or framework
residues, if any, and corrects the translation frame (if
necessary) to produce an active fusion protein comprised of a
DNA binding protein and random peptide. The central portion
of the oligonucleotide will generally contain one or more
random peptide coding sequences (variable region domain) and
spacer or framework residues. The sequences are ultimately
expressed as peptides (with or without spacer or framework
residues) fused to or in the DNA binding protein.

The variable region domain of the oligonucleotide 
encodes a key feature of the library: the random peptide. The 
size of the library will vary according to the number of 
variable codons, and hence the size of the peptides, that are 
desired. Generally, the library will be at least 10^5 to 10^8 
or more members, although smaller libraries may be quite 
useful in some circumstances. To generate the collection of 
oligonucleotides that forms a series of codons encoding a 
random collection of amino acids and that is ultimately cloned 
into the vector, a codon motif is used, such as (NNK)x, where 
N may be A, C, G, or T (nominally equimolar), K is G or T 
(nominally equimolar), and x is typically up to about 5, 6, 7, 
or 8 or more, thereby producing libraries of penta-, hexa-, 
hepta-, and octa-peptides or more. The third position may 
also be G or C, designated "S". Thus, NNK or NNS (i) code for 
all the amino acids, (ii) code for only one stop codon, and 
(iii) reduce the range of codon bias from 6:1 to 3:1. There 
are 32 possible codons resulting from the NNK motif: 1 for 
each of 12 amino acids, 2 for each of 5 amino acids, 3 for 
each of 3 amino acids, and only one of the three stop codons. 
With longer peptides, the size of the library that is 
generated can become a constraint in the cloning process, but 
the larger libraries can be sampled, as described below. The 
expression of peptides from randomly generated mixtures of 
oligonucleotides in recombinant vectors is discussed in 
Oliphant et al., Gene 44, 177-183 (1986), incorporated herein 
by reference. 
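
The codon counts stated for the NNK and NNS motifs can be 
checked by direct enumeration. The following Python sketch is 
illustrative only and is not part of the original disclosure. 
It assumes the standard genetic code; the names CODON_TABLE, 
motif_census, and summarize are hypothetical helpers 
introduced here for the illustration. 

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (TTT, TTC, TTA, TTG, TCT, ...); '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def motif_census(third_position):
    """Count codons per residue for the motif NN<third_position>.

    Pass "GT" for K, "CG" for S, or "ACGT" for a fully random third base.
    """
    codons = ("".join(c) for c in product("ACGT", "ACGT", third_position))
    return Counter(CODON_TABLE[c] for c in codons)


def summarize(census):
    """Return (total codons, stop codons, {codons per residue: residue count})."""
    residues = {aa: n for aa, n in census.items() if aa != "*"}
    return sum(census.values()), census["*"], dict(Counter(residues.values()))


nnk_summary = summarize(motif_census("GT"))
nns_summary = summarize(motif_census("CG"))
print("NNK:", nnk_summary)
print("NNS:", nns_summary)
```

Under these assumptions both motifs give 32 codons, a single 
stop codon, and residue classes of one, two, and three codons 
per amino acid, so the most favored residue is represented 
three times as often as the least favored one. 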

An exemplified codon motif (NNK)x produces 32 
codons, one for each of 12 amino acids, two for each of five 
amino acids, three for each of three amino acids and one 
(amber) stop codon. Although this motif produces a codon 
distribution as equitable as available with standard methods 
of oligonucleotide synthesis, it results in a bias against 
peptides containing one-codon residues. For example, a 
complete collection of hexacodons contains one sequence 
encoding each peptide made up of only one-codon amino acids, 
but contains 729 (3^6) sequences encoding each peptide made up 
of only three-codon amino acids. 
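
The bias can be made concrete by counting how many distinct 
(NNK)x sequences encode a given peptide, which is the product 
of the per-residue codon counts. The sketch below is an 
illustration under the assumption of the standard genetic 
code; the helper names (NNK_COUNTS, nnk_encodings, bias_range) 
and the two example hexapeptides are hypothetical and were 
chosen here only to show the two extremes. 

```python
from collections import Counter
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Number of NNK codons (third base G or T) available for each residue.
NNK_COUNTS = Counter(CODON_TABLE["".join(c)] for c in product("ACGT", "ACGT", "GT"))


def nnk_encodings(peptide):
    """Number of distinct (NNK)x DNA sequences that encode the peptide."""
    return prod(NNK_COUNTS[aa] for aa in peptide)


def bias_range(x):
    """Ratio between the most and least represented peptides of length x."""
    counts = [n for aa, n in NNK_COUNTS.items() if aa != "*"]
    return max(counts) ** x // min(counts) ** x


one_codon_hexamer = "MWFYCH"    # Met, Trp, Phe, Tyr, Cys, His: one NNK codon each
three_codon_hexamer = "LRSLRS"  # Leu, Arg, Ser: three NNK codons each
print(nnk_encodings(one_codon_hexamer), nnk_encodings(three_codon_hexamer))  # 1 729
for x in (5, 6, 7, 8):
    print(x, bias_range(x))
```

The range grows by a factor of three for every added residue, 
which is why the imbalance matters more for longer peptides. 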

An alternate approach that minimizes the bias 
against one-codon residues involves the synthesis of 20 
activated trinucleotides, each representing the codon for one 
of the 20 genetically encoded amino acids. These 
trinucleotides are synthesized by conventional means, removed 
from the support with the base and 5'-OH-protecting groups 
intact, and activated by the addition of 3'-O-phosphoramidite 
(and phosphate protection with beta-cyanoethyl groups) by the 
method used for the activation of mononucleosides, as 
generally described in McBride and Caruthers, Tetr. Letters 
22, 245 (1983), which is incorporated herein by reference. 

Degenerate "oligocodons" are prepared using these 
trimers as building blocks. The trimers are mixed at the 
desired molar ratios and installed in the synthesizer. The 
ratios will usually be approximately equimolar, but may be a 
controlled unequal ratio to obtain the over- to 
under-representation of certain amino acids coded for by the 
degenerate oligonucleotide collection. The condensation of 
the trimers to form the oligocodons is done essentially as 
described for conventional synthesis employing activated 
mononucleosides as building blocks. See generally, Atkinson 
and Smith, Oligonucleotide Synthesis (M.J. Gait, ed.), 
pp. 35-82 (1984). This procedure generates a population of 
oligonucleotides for cloning that is capable of encoding an 
equal distribution (or a controlled unequal distribution) of 
the possible peptide sequences. This approach may be 
especially useful in generating longer peptide sequences, 
because the range of bias produced by the (NNK)x motif 
increases by three-fold with each additional amino acid 
residue. 

When the codon motif is (NNK)x, as defined above, 
and when x equals 8, there are 2.6 x 10^10 possible 
octapeptides. A library containing most of the octapeptides 
may be produced, but a sampling of the octapeptides may be 
more conveniently constructed by making only a subset library 
using about 0.1%, and up to as much as 1%, 5%, or 10%, of the 
possible sequences, which subset of recombinant vectors is 
then screened. As the library size increases, smaller 
percentages are acceptable. If desired, to extend the 
diversity of a subset library the recovered vector subset may 
be subjected to mutagenesis and then subjected to subsequent 
rounds of screening. This mutagenesis step may be 
accomplished in two general ways: the variable region of the 
recovered phage may be mutagenized or additional variable 
amino acids may be added to the regions adjoining the initial 
variable sequences. 
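
The size of the octapeptide space and of the subset libraries 
mentioned above follows from simple arithmetic. The sketch 
below is illustrative only; the function names are 
hypothetical, and it treats a subset library simply as the 
stated fraction of the possible sequences, ignoring the 
duplicate clones that random sampling would produce in 
practice. 

```python
from fractions import Fraction


def peptide_space(x, residues=20):
    """Number of distinct peptides of length x over the 20 encoded residues."""
    return residues ** x


def subset_size(total, fraction):
    """Distinct sequences held by a subset library covering the given fraction."""
    return int(total * Fraction(fraction))


octapeptides = peptide_space(8)
print(f"{octapeptides:.2e}")  # 2.56e+10
for fraction in ("0.001", "0.01", "0.05", "0.10"):
    print(fraction, subset_size(octapeptides, fraction))
```

Exact fractions are used instead of floating point so that the 
subset sizes come out as whole numbers. 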

The process of constructing a random peptide 
encoding oligonucleotide is described in Example 2, below. In 
brief, a library can be constructed in pMC5 using the 
half-site cloning strategy of Cwirla et al., supra. A random 
dodecamer peptide sequence, connected to the C-terminus of the 
lac repressor through a linker peptide GADGGA (GADGA [SEQ ID 
NO:55] would also be an acceptable linker), can be specified 
by a degenerate oligonucleotide population containing twelve 
codons of the form NNK, where N is any base, and K is G or T. 
Transformation of E. coli strain MC1061 using 4 µg of pMC5 
ligated to a four fold molar excess of annealed 
oligonucleotides yielded a test library of 5.5 x 10^8 
independent clones. 
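
A degenerate population of twelve NNK codons can be simulated 
to see what kind of sequences such an oligonucleotide pool 
contains. The following sketch is a hypothetical illustration 
and not the procedure of Example 2: it assumes equimolar bases 
at every position and the standard genetic code, it omits the 
linker and cloning ends, and the names random_nnk_oligo and 
translate are introduced here. 

```python
import random

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def random_nnk_oligo(n_codons, rng):
    """Draw one member of an (NNK)n population with equimolar bases."""
    return "".join(
        rng.choice("ACGT") + rng.choice("ACGT") + rng.choice("GT")
        for _ in range(n_codons)
    )


def translate(dna):
    """Translate an in-frame DNA string codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [random_nnk_oligo(12, rng) for _ in range(5)]
for oligo in population:
    print(oligo, translate(oligo))
```

Any stop codon that appears in such a population is the amber 
codon TAG, because TAA and TGA end in A and are excluded by 
the K position. 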

Once the library is constructed, host cells are 
transformed with the library vectors. The successful 
transformants are typically selected by growth in a selective 
medium or under selective conditions, e.g., an appropriate 
antibiotic, which, in the case of plasmid pMC5 derivatives, is 
preferably ampicillin. This selection may be done on solid or 
in liquid growth medium. For growth on solid medium, the 
cells are grown at a high density (~10^8 to 10^9 transformants 
per m^2) on a large surface of, for example, L-agar containing 
the selective antibiotic to form essentially a confluent lawn. 
For growth in liquid culture, cells may be grown in L-broth 
(with antibiotic selection) through about 10 or more 
doublings. Growth in liquid culture may be more convenient 
because of the size of the libraries, while growth on solid 
media likely provides less chance of bias during the 
amplification process. 

For best results with the present method, one should 
control the ratio of fusion proteins to vectors so that 
vectors are saturated with fusion proteins, without a vast 
excess of fusion protein. Too little fusion protein could 
result in vectors with free binding sites that might be filled 
by fusion protein from other cells in the population during 
cell lysis, thus breaking the connection between the genetic 
information and the peptide ligand. Too much fusion protein 
could lead to titration of available receptor sites during 
panning by fusion protein molecules not bound to plasmid. To 
control this ratio, one can use any of a variety of origin of 
replication sequences to control vector number and/or an 
inducible promoter, such as any of the promoters selected from 
the group consisting of the araB, lambda pL (which can be 
either nalidixic acid or heat inducible or both), trp, lac, 
T7, T3, and tac or trc (these latter two are trp/lac hybrids) 
promoters to control fusion protein number. A regulated 
promoter is also useful to limit the amount of time that the 
peptide ligands are exposed to cellular proteases. By 
inducing the promoter a short time before lysing the cells 
containing a library, one can minimize the time during which 
proteases act. 

The araB promoter normally drives expression of the 
enzymes of the E. coli araBAD operon, which are involved in 
the catabolism of L-arabinose. The araB promoter is regulated 
both positively and negatively, depending on the presence of 
L-arabinose in the growth medium, by the araC protein. This 
promoter can be catabolite repressed by adding glucose to the 
growth medium and induced by adding L-arabinose to the medium. 
Plasmid pMC5 encodes and can drive expression of the araC 
protein (see Lee, The Operon (Miller and Reznikoff, eds., Cold 
Spring Harbor Laboratory), pp. 389-409 (1980)). The araB 
promoter is also regulated by the CAP protein, an activator 
involved in the E. coli system of catabolite repression. 

The expression level of the lacI fusion gene under 
the control of the araB promoter in plasmid pMC5 can be 
controlled over a very wide range through changes in the 
growth medium. One can construct a vector to measure 
expression of a fusion protein encoding gene to determine the 
growth conditions needed to maintain an acceptable ratio of 
repressors to vectors. Plasmid pMC3 is such a vector and can 
be constructed by attaching an oligonucleotide that encodes a 
short peptide linker (GADGA [SEQ ID NO:55]) followed by 
dynorphin B (YGGFLRRQFKVVT [SEQ ID NO:7]) to the lacI gene in 
plasmid pMC5. Monoclonal antibody D32.39 binds to 
dynorphin B, a 13 amino acid opioid peptide (see Barrett and 
Goldstein, Neuropeptides 6, 113-120 (1985), incorporated 
herein by reference). These same reagents, plasmids pMC3 and 
pMC5 and receptor D32.39, provide a test receptor and positive 
and negative controls for use in panning experiments, 
described below. Growth of E. coli transformants harboring 
plasmid pMC3 in LB broth (10 g of tryptone, 5 g of NaCl, and 
5 g of yeast extract per liter) allowed detection in a Western 
blot of a faint band of the expected molecular weight, while 
addition of 0.2% glucose rendered this band undetectable. 
Growth in LB plus 0.2% L-arabinose led to the production of a 
very heavy band on a stained gel, representing greater than 
25% of the total cell protein. 

To prevent overproduction of the fusion protein 
encoded by a plasmid pMC5 derivative (or any other vector of 
the present invention that has an inducible promoter), one can 
grow the transformants first under non-inducing conditions (to 
minimize exposure of the fusion protein to cellular proteases 
and to minimize exposure of the cell to the possibly 
deleterious effects of the fusion protein) and then under 
"partial induction" conditions. For the araB promoter, 
partial induction can be achieved with as little as 3.3 x 
10^-5 % of L-arabinose (as demonstrated by increased repression 
in the assay described below). A preferred way to achieve 
partial induction consists of growing the cells in 0.1% 
glucose until about 30 min. before the cells are harvested; 
then, 0.2 to 0.5% L-arabinose is added to the culture to 
induce expression of the fusion protein. Other methods to 
express the protein controllably are available. 

One can estimate the lacI expression level necessary 
to fill the available binding sites in a typical plasmid pMC5 
derivative by observing the behavior of strain ARI 20 (lacI 
lacZYA+) transformed with pMC3 or pMC5 (encoding only the 
linker peptide GADGA [SEQ ID NO:55]). Because the lacO sites 
in plasmids pMC3 and pMC5 have higher affinity than those in 
the lacZYA operon, the available repressor should fill the 
plasmid sites first. Substantial repression of lacZYA should 
be observed only if there is an excess of repressor beyond the 
amount needed to fill the plasmid sites. As shown by color 
level on X-gal indicator plates and direct assays of 
β-galactosidase (see Miller, Experiments in Molecular Genetics 
(Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 
(1972)), incorporated herein by reference), the amount of 
repressor produced by pMC5 is sufficient to fill the lacO 
sites and repress lacZYA greater than 200 fold in ARI 20 host 
cells during growth in normal LB medium (2.4 units compared to 
500 units from ARI 20 transformed with vector pBAD18, which 
has no lacI). The repressor encoded by pMC3 was partially 
inactivated by the addition of the dynorphin B tail, allowing 
about 10 fold higher expression of lacZYA (37 units). Because 
of the apparent excess production of repressor under these 
conditions, LB is a preferred medium for expressing similar 
fusion proteins of the invention. 
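
The fold-repression figures follow directly from the reported 
enzyme units. The short sketch below only restates that 
arithmetic; the function and constant names are hypothetical, 
and the three unit values are the ones quoted in the preceding 
paragraph. 

```python
def fold_repression(unrepressed_units, repressed_units):
    """Ratio of reporter activity without repressor to activity with it."""
    return unrepressed_units / repressed_units


NO_REPRESSOR_UNITS = 500.0  # host carrying the vector without lacI
LINKER_ONLY_UNITS = 2.4     # host carrying the linker-only fusion plasmid
DYNORPHIN_UNITS = 37.0      # host carrying the dynorphin-tailed fusion plasmid

linker_fold = fold_repression(NO_REPRESSOR_UNITS, LINKER_ONLY_UNITS)
dynorphin_fold = fold_repression(NO_REPRESSOR_UNITS, DYNORPHIN_UNITS)
relative_expression = DYNORPHIN_UNITS / LINKER_ONLY_UNITS

print(round(linker_fold), round(dynorphin_fold, 1), round(relative_expression, 1))
```

With these values the linker-only fusion represses a little 
over 200 fold, and the dynorphin-tailed fusion leaves reporter 
expression roughly an order of magnitude higher. 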

At some point during the growth of the 
transformants, the fusion protein will be expressed. Because 
the random peptide vector also contains DNA binding sites for 
the DNA binding protein, fusion proteins will bind to the 
vectors that encode them. After these complexes form, the 
cells containing a library are lysed, and the complexes are 
partially purified away from cell debris. Following cell 
lysis, one should avoid cross reaction between unbound fusion 
proteins of one cell with heterologous DNA molecules of 
another cell. The presence of high concentrations of the DNA 
binding site for the DNA binding protein will minimize this 
type of cross reaction. Thus, for the lac system, one can 
synthesize a DNA duplex encoding the lacO or a mutated lacO 
sequence for addition to the cell lysis solution. The 
compound ONPF, as well as lactose, is known to strengthen the 
binding of the lac repressor to lacO, so one can also, or 
alternatively, add ONPF or lactose to the cell lysis solution 
to minimize this type of cross reaction. 

After cell lysis, in a process called panning, 
plasmid-peptide complexes that bind specifically to 
immobilized receptors are separated from nonbinding complexes, 
which are washed away. Bulk DNA can be included during the 
lysis and panning steps to compete for non-specific binding 
sites and to lower the background of non-receptor-specific 
binding to the immobilized receptor. A variety of washing 
procedures can be used to enrich for retention of molecules 
with desired affinity ranges. For affinity enrichment of 
desired clones, from about 10^2 to 10^6 library equivalents (a 
library equivalent is one of each recombinant; 10^4 equivalents 
of a library of 10^9 members is 10^13 vectors), but typically 
10^3 to 10^4 library equivalents, are incubated with a receptor 
(or portion thereof) for which a peptide ligand is 
desired. The receptor is in one of several forms appropriate 
for affinity enrichment schemes. In one example the receptor 
is immobilized on a surface or particle, and the library is 
then panned on the immobilized receptor generally according to 
the procedure described below. 

A second example of receptor presentation is 
receptor attached to a recognizable ligand (which may be 
attached via a spacer). A specific example of such a ligand 
is biotin. The receptor, so modified, is incubated with the 
library, and binding occurs with both reactants in solution. 
The resulting complexes are then bound to streptavidin (or 
avidin) through the biotin moiety. See PCT patent publication 
No. 91/07087. The streptavidin may be immobilized on a 
surface such as a plastic plate or on particles, in which case 
the complexes (vector/DNA binding protein/peptide/receptor/ 
biotin/streptavidin) are physically retained; or the 
streptavidin may be labelled, with a fluorophore, for example, 
to tag the active fusion protein for detection and/or 
isolation by sorting procedures, e.g., on a 
fluorescence-activated cell sorter. 

Vectors that express peptides without the desired 
specificity are removed by washing. The degree and stringency 
of washing required will be determined for each 
receptor/peptide of interest. A certain degree of control can 



WO 96/40987 



PCT/US96/09809 




be exerted over the binding characteristics of the peptides 
recovered by adjusting the conditions of the binding 
incubation and the subsequent washing. The temperature, pH, 
ionic strength, divalent cation concentration, and the volume 
and duration of the washing will select for peptides within 
particular ranges of affinity for the receptor. Selection 
based on slow dissociation rate, which is usually predictive 
of high affinity, is the most practical route. This may be 
done either by continued incubation in the presence of a 
saturating amount of free ligand, or by increasing the volume, 
number, and length of the washes. In each case, the rebinding 
of dissociated peptide-vector is prevented, and with 
increasing time, peptide-vectors of higher and higher affinity 
are recovered. Additional modifications of the binding and 
washing procedures may be applied to find peptides that bind 
receptors under special conditions. 
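
The effect of wash time on selection by dissociation rate can 
be illustrated with a simple model. The sketch below is an 
assumption-laden illustration, not a description of the 
disclosed procedure: it assumes first-order dissociation with 
no rebinding, and the two off-rate constants are hypothetical 
values chosen only to contrast a slowly and a rapidly 
dissociating complex. 

```python
import math


def fraction_bound(k_off_per_s, wash_seconds):
    """Fraction of complexes still bound after washing.

    Assumes first-order dissociation and no rebinding of released complexes.
    """
    return math.exp(-k_off_per_s * wash_seconds)


def enrichment(k_off_slow, k_off_fast, wash_seconds):
    """Retention of the slow-dissociating complex relative to the fast one."""
    return fraction_bound(k_off_slow, wash_seconds) / fraction_bound(
        k_off_fast, wash_seconds
    )


SLOW_K_OFF = 1e-4  # per second, hypothetical tight binder
FAST_K_OFF = 1e-2  # per second, hypothetical weak binder

for minutes in (0, 5, 15, 30):
    t = minutes * 60
    print(
        minutes,
        f"{fraction_bound(SLOW_K_OFF, t):.3f}",
        f"{fraction_bound(FAST_K_OFF, t):.2e}",
        f"{enrichment(SLOW_K_OFF, FAST_K_OFF, t):.2e}",
    )
```

In this model the enrichment grows exponentially with wash 
time, while the absolute recovery of even the slowly 
dissociating complex falls, so wash length trades yield 
against stringency. 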

Although the screening method is highly specific, 
the procedure generally does not discriminate between peptides 
of modest affinity (micromolar dissociation constants) and 
those of high affinity (nanomolar dissociation constants or 
greater). The ability to select peptides with relatively low 
affinity may be the result of multivalent interaction between 
a vector/fusion protein complex and a receptor. For instance, 
when the receptor is an IgG antibody, each complex may bind to 
more than one antibody binding site, either by a single 
complex binding through the multiple peptides displayed to 
both sites of a single IgG molecule or by forming a network of 
complex-IgG. Multivalent interaction produces a high avidity 
and tenacious adherence of the vector during washing. 
Multivalent interactions can be mimicked by using a high 
density of immobilized monovalent receptor. 

To enrich for the highest affinity peptide ligands, 
a substantially monovalent interaction between vector and the 
receptor (typically immobilized on a solid phase) may be 
appropriate. The screening (selection) with substantially 
monovalent interaction can be repeated as part of additional 
rounds of amplification and selection of vectors. Monovalent 
interactions may be achieved by employing low concentrations 
of receptor, such as the Fab binding fragment of an antibody 
molecule. 

A strategy employing a combination of conditions 
favoring multivalent or monovalent interactions can be used to 
advantage in producing new peptide ligands for receptor 
molecules. By conducting the first rounds of screening under 
conditions to promote multivalent interactions, one can then 
use high stringency washing to reduce greatly the background 
of non-specifically bound vectors. This high avidity step may 
select a large pool of peptides with a wide range of 
affinities, including those with relatively low affinity. 
Subsequent screening under conditions favoring increasingly 
monovalent interactions and isolation of plasmid complexes 
based on a slow dissociation rate may then allow the 
identification of the highest affinity peptides. 

After washing the receptor-fusion protein-vector 
complexes to select for peptides of the desired affinity, the 
vector DNA is then released from bound complexes by, for 
example, treatment with high salt or extraction with phenol, 
or both. For the lac system, one can use IPTG, a compound 
known to decrease the stability of the lac repressor-lacO 
complex, to dissociate the plasmid from the fusion protein. 
In a preferred embodiment, the elution buffer is composed of 
1 mM IPTG, 10 µg/ml of a double-stranded oligonucleotide that 
contains lacOs, and 0.2 M KCl. Once released from bound 
complexes, the plasmids are reintroduced into E. coli by 
transformation. Because of the high efficiency, the preferred 
method of transformation is electroporation. Using this new 
population of transformants, one can repeat additional cycles 
of panning to increase the proportion of peptides in the 
population that are specific for the receptor. The structure 
of the binding peptides can then be determined by sequencing 
the 3' region of the lacI fusion gene. 

As noted above, antibody D32.39 and the pMC3 complex 
serve as a receptor-ligand positive control in panning 
experiments to determine ability to recover plasmids based on 
the sequence of the fusion peptide. Useful negative controls 
are pMC5, which encodes only the linker fusion peptide (GADGA 
[SEQ ID NO:55]), and pMC1, which encodes the dynorphin B 
peptide, but lacks the lacO sequences carried by pMC3 and 
pMC5. Lysates of E. coli strains carrying each plasmid were 
panned on D32.39 immobilized on polystyrene petri dishes. 
After washing, plasmids were recovered from complexes bound to 
the plates by phenol extraction, followed by transformation of 
E. coli. 

The results with pure lysates demonstrated about 100 
fold more transformants recovered from pMC3 lysates as 
compared to the negative controls. The results with mixed 
lysates revealed enrichment of pMC3 versus controls among the 
population of recovered plasmids. Experiments with cells that 
were mixed before lysis yielded similar results. These 
results show that the plasmid-lacI-peptide complexes were 
sufficiently stable to allow enrichment of plasmids on the 
basis of the peptide the plasmids encode. 

The random dodecapeptide library in pMC5 described 
above was used in the screening method of the invention to 
identify vectors that encode a fusion protein that comprised a 
peptide that would bind to D32.39 antibody coupled to sheep 
antimouse antibody coated magnetic beads. The number of 
complexes added to the beads at each round of panning yielded 
the equivalent of 10^10 to 10^11 transformants (see Example 3). 
After panning, the recovered plasmids yielded transformants 
ranging in number from about 10^8 in early rounds to almost 
10^11 in the fourth and final round. Compared to the number of 
transformants from antibody panned complexes, panning against 
unmodified polystyrene beads produced orders of magnitude 
fewer transformants. 

The above results demonstrate that the DNA binding 
activity of lac repressor can act as a link between random 
peptides and the genetic material encoding them and so serve 
as the base on which to construct large peptide ligand 
libraries that can be efficiently screened. In the screening 
process, plasmid-repressor-peptide complexes are isolated by 
panning on immobilized receptor, the plasmids are amplified 
after transformation of E. coli, and the procedure is repeated 
to enrich for plasmids encoding peptides specific for the 
receptor. The repressor binds to the library plasmid with 
sufficient avidity to allow panning of the library on 
immobilized receptor without problematic levels of 
dissociation. This system can be used to identify a series of 
related peptides that bind to a monoclonal antibody whose 
epitope has not been characterized and to identify peptide 
ligands for other receptors. 

Once a peptide ligand of interest has been 
identified, a variety of techniques can be used to diversify a 
peptide library to construct ligands with improved properties. 
In one approach, the positive vectors (those identified in an 
early round of panning) are sequenced to determine the 
identity of the active peptides. Oligonucleotides are then 
synthesized based on these peptide sequences, employing all 
bases at each step at concentrations designed to produce 
slight variations of the primary oligonucleotide sequences. 
This mixture of (slightly) degenerate oligonucleotides is then 
cloned into the random peptide library expression vector as 
described herein. This method produces systematic, controlled 
variations of the starting peptide sequences but requires 
that individual positive vectors be sequenced before 
mutagenesis. This method is useful for expanding the 
diversity of small numbers of recovered vectors. 

Another technique for diversifying a selected 
peptide involves the subtle misincorporation of nucleotide 
changes in the coding sequence for the peptide through the use 
of the polymerase chain reaction (PCR) under low fidelity 
conditions. A protocol described in Leung et al., Technique 
1, 11-15 (1989), utilizes altered ratios of nucleotides and 
the addition of manganese ions to produce a 2% mutation 
frequency. 
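
The consequence of such a mutation frequency for a short 
variable region can be estimated with a binomial model. The 
sketch below is illustrative only and rests on assumptions 
that are not stated in the text: that the 2% figure applies 
per nucleotide position, that positions mutate independently, 
and that the region of interest is the 36 bases of a 
twelve-codon variable domain. The function name is 
hypothetical. 

```python
from math import comb


def mutation_count_distribution(n_bases, per_base_rate):
    """Binomial probabilities of 0..n_bases mutations in the region."""
    return [
        comb(n_bases, k) * per_base_rate**k * (1 - per_base_rate) ** (n_bases - k)
        for k in range(n_bases + 1)
    ]


N_BASES = 36   # assumed region: twelve codons
RATE = 0.02    # assumed per-base mutation frequency

distribution = mutation_count_distribution(N_BASES, RATE)
expected_mutations = N_BASES * RATE
p_unchanged = distribution[0]
p_one_or_two = distribution[1] + distribution[2]

print(round(expected_mutations, 2))
print(round(p_unchanged, 3), round(p_one_or_two, 3))
```

Under these assumptions roughly half of the recovered coding 
sequences would carry at least one change, and most of the 
altered sequences would carry only one or two, which is 
consistent with the aim of subtle diversification. 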

Yet another approach for diversifying a selected 
random peptide vector involves the mutagenesis of a pool, or 
subset, of recovered vectors. Recombinant host cells 
transformed with vectors recovered from panning are pooled and 
isolated. The vector DNA is mutagenized by treating the cells 
with, e.g., nitrous acid, formic acid, hydrazine, or by use of 
a mutator strain as described below. These treatments produce 
a variety of mutations in the vector DNA. The segment 
containing the sequence encoding the variable peptide can 
optionally be isolated by cutting with restriction nuclease(s) 
specific for sites flanking the variable region and then 
recloned into undamaged vector DNA. Alternatively, the 
mutagenized vectors can be used without recloning of the 
mutagenized random peptide coding sequence. 

In the second general approach for diversifying a 
set of peptide ligands, that of adding additional amino acids 
to a peptide or peptides found to be active, a variety of 
methods are available. In one, the sequences of peptides 
selected in early panning are determined individually and new 
oligonucleotides, incorporating all or part of the determined 
sequence and an adjoining degenerate sequence, are 
synthesized. These are then cloned to produce a secondary 
library. 

In another approach that adds a second variable 
region to a pool of random peptide expression vectors, a 
restriction site is installed next to the primary variable 
region. Preferably, the enzyme should cut outside of its 
recognition sequence, such as BspMI, which cuts leaving a four 
base 5' overhang, four bases to the 3' side of the recognition 
site. Thus, the recognition site may be placed four bases 
from the primary degenerate region. To insert a second 
variable region, a degenerately synthesized oligonucleotide is 
then ligated into this site to produce a second variable 
region juxtaposed to the primary variable region. This 
secondary library is then amplified and screened as before. 

While in some instances it may be appropriate to 
synthesize peptides having contiguous variable regions to bind 
certain receptors, in other cases it may be desirable to 
provide peptides having two or more regions of diversity 
separated by spacer residues. For example, the variable 
regions may be separated by spacers that allow the diversity 
domains of the peptides to be presented to the receptor in 
different ways. The distance between variable regions may be 
as little as one residue or as many as five to ten to up to 
about 100 residues. For probing a large binding site, one may 
construct variable regions separated by a spacer containing 20 
to 30 amino acids. The number of spacer residues, when 
present, will preferably be at least two to three or more but 
usually will be less than eight to ten. An oligonucleotide 
library having variable domains separated by spacers can be 
represented by the formula: (NNK)y-(abc)n-(NNK)z, where N and 
K are as defined previously (note that S as defined previously 
may be substituted for K); y + z is equal to about 5, 6, 7, 8, 
or more; a, b and c represent the same or different 
nucleotides comprising a codon encoding spacer amino acids; 
and n is up to about 20 to 30 codons or more. 
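
The formula can be written out as an explicit degenerate 
template. The sketch below is a hypothetical illustration: the 
function names are introduced here, and the spacer codons GGT, 
CCG, and GGT (encoding Gly, Pro, and Gly in the standard 
genetic code) are an arbitrary example rather than a sequence 
taken from the disclosure. 

```python
def spaced_library_template(y, spacer_codons, z, wobble="K"):
    """Write out (NNK)y-(abc)n-(NNK)z as a hyphen-separated codon template.

    spacer_codons lists the fixed spacer codons; wobble is "K" (G or T)
    or "S" (G or C) for the third position of each variable codon.
    """
    if wobble not in ("K", "S"):
        raise ValueError("wobble must be 'K' or 'S'")
    for codon in spacer_codons:
        if len(codon) != 3 or set(codon) - set("ACGT"):
            raise ValueError(f"invalid spacer codon: {codon!r}")
    variable = "NN" + wobble
    return "-".join([variable] * y + list(spacer_codons) + [variable] * z)


def variable_codon_combinations(y, z):
    """Number of distinct DNA sequences in the library (32 per variable codon)."""
    return 32 ** (y + z)


template = spaced_library_template(3, ["GGT", "CCG", "GGT"], 3)
print(template)  # NNK-NNK-NNK-GGT-CCG-GGT-NNK-NNK-NNK
print(variable_codon_combinations(3, 3))
```

Only the variable codons contribute to library size; the 
spacer fixes the separation between the two variable regions 
without adding diversity. 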

The spacer residues may be somewhat flexible, 
comprising oligoglycine, for example, to provide the diversity 
domains of the library with the ability to interact with sites 
in a large binding site relatively unconstrained by attachment 
to the DNA binding protein. Rigid spacers, such as, e.g., 
oligoproline, may also be inserted separately or in 
combination with other spacers, including glycine residues. 
The variable domains can be close to one another with a spacer 
serving to orient the one variable domain with respect to the 
other, such as by employing a turn between the two sequences, 
as might be provided by a spacer of the sequence Gly-Pro-Gly, 
for example. To add stability to such a turn, it may be 
desirable or necessary to add Cys residues at either or both 
ends of each variable region. The Cys residues would then 
form disulfide bridges to hold the variable regions together 
in a loop, and in this fashion may also serve to mimic a 
cyclic peptide. Of course, those skilled in the art will 
appreciate that various other types of covalent linkages for 
cyclization may also be accomplished. 

The spacer residues described above can also be 
encoded on either or both ends of the variable nucleotide 
region. For instance, a cyclic peptide coding sequence can be 
made without an intervening spacer by having a Cys codon on 
both ends of the random peptide coding sequence. As above, 
flexible spacers, e.g., oligoglycine, may facilitate 
interaction of the random peptide with the selected receptors. 
Alternatively, rigid spacers may allow the peptide to be 
presented as if on the end of a rigid arm, where the number of 
residues, e.g., Pro, determines not only the length of the arm 
but also the direction for the arm in which the peptide is 
oriented. Hydrophilic spacers, made up of charged and/or 
uncharged hydrophilic amino acids (e.g., Thr, His, Asn, Gln, 
Arg, Glu, Asp, Met, Lys, etc.), or hydrophobic spacers made up 
of hydrophobic amino acids (e.g., Phe, Leu, Ile, Gly, Val, 
Ala, etc.) may be used to present the peptides to binding 
sites with a variety of local environments. 

The present invention can be used to construct 
improved spacer molecules. For example, one can construct a 
random peptide library that encodes a DNA binding protein, 
such as the lac repressor or a cysteine depleted lac repressor 
(described below), a random peptide of formula (NNK)5 (sequences 
up to and including (NNK)10 or (NNK)15 could also be used), and a 
peptide ligand of known specificity. One would then screen 
the library for improved binding of the peptide ligand to the 
receptor specific for the ligand using the method of the 
present invention; fusion proteins that exhibit improved 
specificity would be isolated together with the vector that 
encodes them, and the vector would be sequenced to determine 
the structure of the spacer responsible for the improved 
binding. 

Unless modified during or after synthesis by the 
translation machinery, recombinant peptide libraries consist 
of sequences of the 20 normal L-amino acids. While the 
available structural diversity for such a library is large, 
additional diversity can be introduced by a variety of means, 
such as chemical modifications of the amino acids. For 
example, as one source of added diversity a peptide library of 
the invention can be subjected to carboxy terminal amidation. 
Carboxy terminal amidation is necessary to the activity of 
many naturally occurring bioactive peptides. This 
modification occurs in vivo through cleavage of the N-C bond 
of a carboxy terminal Gly residue in a two-step reaction 
catalyzed by the enzymes peptidylglycine alpha-amidation 
monooxygenase (PAM) and hydroxyglycine aminotransferase 
(HGAT). See Eipper et al., J. Biol. Chem. 266, 7827-7833 
(1991); Mizuno et al., Biochem. Biophys. Res. Comm. 137(3), 
984-991 (1986); Murthy et al., J. Biol. Chem. 261(4), 
1815-1822 (1986); Katopodis et al., Biochemistry 29, 6115-6120 
(1990); and Young and Tamburini, J. Am. Chem. Soc. 111, 
1933-1934 (1989), each of which is incorporated herein by 
reference. 

Amidation can be performed by treatment with 
enzymes, such as PAM and HGAT, in vivo or in vitro, and under 
conditions conducive to maintaining the structural integrity 
of the fusion protein/vector complex. In a random peptide 
library of the present invention, amidation will occur on a 
library subset, i.e., those peptides having a carboxy terminal 
Gly. A library of peptides designed for amidation can be 
constructed by introducing a Gly codon at the end of the 
variable region domain of the library. After amidation, an 
enriched library serves as a particularly efficient source of 
ligands for receptors that preferentially bind amidated 
peptides. Many of the C-terminus amidated bioactive peptides 
are processed from larger pro-hormones, where the amidated 
peptide is flanked at its C-terminus by the sequence 
-Gly-Lys-Arg-X . . . [SEQ ID NO: 57] (where X is any amino 
acid). Oligonucleotides encoding the sequence 
-Gly-Lys-Arg-X-Stop [SEQ ID NO: 67] can be placed at the 3' end 
of the variable oligonucleotide region. When expressed, the 
Gly-Lys-Arg-X [SEQ ID NO: 57] is removed by in vivo or in vitro 
enzymatic treatment, and the peptide library is carboxy 
terminal amidated as described above. 

Conditions for C-terminal amidation of the libraries 
of the invention were developed using a model system that 
employed an antibody specific for the amidated C-terminus of 
the peptide cholecystokinin (CCK). The reaction conditions to 
make the peptide alpha-amidating monooxygenase (PAM) enzyme 
active when used to amidate the libraries were developed using 
an 125I labeled small peptide substrate and an ELISA with a 
positive control glycine extended CCK octamer peptide fused to 
the lac repressor. The E. coli strain used in the experiment 
carried plasmid pJS129, which encodes the cysteine free lac 
repressor (described below) fused to the CCK substrate peptide 
(DYMGWMDFG). A panning lysate was made from this strain using 
the standard panning protocol (see Example 5). After 
concentrating the column fractions in a Centriprep 100, four 
samples were prepared, each containing 0.25 ml of lysate and 
0.25 ml of 2x PAM buffer (prepared by mixing 0.2 ml of 1 M 
HEPES, pH 7.4, 0.9 ml of 20% lactose, 3.65 ml of H2O, 
0.1 ml of a solution composed of 20 mg/ml catalase, 15.6 µl of 
6 M NaI, and 150 µl of 0.1 M ascorbic acid). PAM enzyme was 
added to the tubes and incubated at 37°C for 30 minutes. 
Then, 120 µl of 5% BSA in HEKL buffer and 6 µl of herring DNA 
were added to each tube; the contents of each tube were then 
added to 6 microtiter wells that had been coated with 
2 µg/well anti-CCK antibody and blocked with BSA. The 
microtiter plate was agitated at 4°C for 150 minutes, washed 
5x with cold HEKL, washed for 10 minutes with a solution 
composed of HEKL, 1% BSA, and 0.1 mg/ml herring DNA, and 
washed again 5x with cold HEKL. The plasmids were eluted 
using the standard protocol and used to transform E. coli host 
cells. The results showed a dramatic increase in the recovery 
of plasmid transformants with increasing amounts of PAM 
enzyme, demonstrating that the amidation reaction worked. 
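
For convenience, the final concentrations implied by the 
buffer recipe above can be worked out arithmetically. The 
following Python sketch is illustrative only. It assumes that 
the listed volumes are simply additive, that the 6 M sodium 
salt is sodium iodide (labelled NaI here, as read from the 
text), and that mixing 0.25 ml of lysate with 0.25 ml of 2x 
buffer halves each concentration. The names STOCKS, 
total_volume_ml and final_concentrations are hypothetical. 

```python
# (volume in ml, stock concentration, unit) for each component of the 2x PAM buffer.
STOCKS = {
    "HEPES pH 7.4": (0.2, 1.0, "M"),
    "lactose": (0.9, 20.0, "%"),
    "water": (3.65, 0.0, ""),
    "catalase": (0.1, 20.0, "mg/ml"),
    "NaI": (0.0156, 6.0, "M"),  # assumption: the 6 M stock is sodium iodide
    "ascorbic acid": (0.150, 0.1, "M"),
}


def total_volume_ml(stocks):
    """Sum of all component volumes, assuming volumes are additive."""
    return sum(volume for volume, _, _ in stocks.values())


def final_concentrations(stocks, dilution=1.0):
    """Concentration of each solute after mixing, times an optional dilution factor."""
    total = total_volume_ml(stocks)
    return {
        name: (volume * conc / total * dilution, unit)
        for name, (volume, conc, unit) in stocks.items()
        if conc > 0
    }


buffer_2x = final_concentrations(STOCKS)
reaction_1x = final_concentrations(STOCKS, dilution=0.5)  # 0.25 ml lysate + 0.25 ml buffer

for name, (value, unit) in reaction_1x.items():
    print(f"{name}: {value:.4g} {unit}")
```

Under these assumptions the 2x buffer works out to roughly 
40 mM HEPES, 3.6% lactose, 0.4 mg/ml catalase, 19 mM of the 
sodium salt and 3 mM ascorbic acid, each of which is halved in 
the reaction mixture. 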

Other modifications found in naturally occurring 
peptides and proteins can be introduced into the libraries to 
provide additional diversity and to contribute to a desired 
biological activity. For example, the variable region library 
can be provided with codons that code for amino acid residues 
involved in phosphorylation, glycosylation, sulfation, 
isoprenylation (or the addition of other lipids), etc. 
Modifications not catalyzed by naturally occurring enzymes can 
be introduced by chemical means (under relatively mild 
conditions) or through the action of, e.g., catalytic 
antibodies and the like. In most cases, an efficient strategy 
for library construction involves specifying the enzyme (or 
chemical) substrate recognition site within or adjacent to the 
variable nucleotide region of the library so that most members 
of the library are modified. The substrate recognition site 
added can be simply a single residue (e.g., serine for 
phosphorylation) or a complex consensus sequence, as desired. 

Conformational constraints, or scaffolding, can also 
be introduced into the structure of the peptide libraries. A 
number of motifs from known protein and peptide structures can 
be adapted for this purpose. The method involves introducing 
nucleotide sequences that code for conserved structural 
residues into or adjacent to the variable nucleotide region so 
as to contribute to the desired peptide structure. Positions 
nonessential to the structure are allowed to vary. 

A degenerate peptide library as described herein can 
incorporate the conserved frameworks to produce and/or 
identify members of families of bioactive peptides or their 
binding receptor elements. Several families of bioactive 
peptides are related by a secondary structure that results in 
a conserved "framework," which in some cases is a pair of 
cysteines that flank a string of variable residues. This 
results in the display of the variable residues in a loop 
closed by a disulfide bond, as discussed above. 

In some cases, a more complex framework that 
contributes to the bioactivity of the peptides is shared among 
members of a peptide family. An example of this class is the 
conotoxins: peptide toxins of 10 to 30 amino acids produced 
by venomous molluscs known as predatory cone snails. The 
conotoxin peptides generally possess a high density of 
disulfide crosslinking. Of those that are highly crosslinked, 
most belong to two groups, mu and omega, that have conserved 
primary frameworks as follows (C is Cys): 

mu      CC...C...C...CC; and 

omega   C...C...CC...C...C 

The number of residues flanked by each pair of Cys residues 
varies from 2 to 5 in the peptides reported to date. The side 
chains of the residues that flank the Cys residues are 
apparently not conserved in peptides with different 
specificity, as in peptides from different species with 
similar or identical specificities. Thus, the conotoxins have 
exploited a conserved, densely crosslinked motif as a 
framework for hypervariable regions to produce a huge array of 
peptides with many different pharmacological effects. 

The mu and omega classes (with 6 Cys residues) have 
15 possible combinations of disulfide bonds. Usually only one 
of these conformations is the active ("correct") form. The 
correct folding of the peptides may be directed by a conserved 
40 residue peptide that is cleaved from the N-terminus of the 
conopeptide to produce the small, mature bioactive peptides 
that appear in the venom. 
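
The figure of 15 follows from counting the ways in which six 
cysteines can be joined into three disulfide bonds (5 x 3 x 1). 
The following Python sketch, offered only as an illustration, 
enumerates those pairings explicitly. The function name 
pairings and the labels C1 to C6 are hypothetical. 

```python
def pairings(items):
    """Yield every way of splitting `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        # Pair the first item with each possible partner, then pair up the remainder.
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


cysteines = ["C1", "C2", "C3", "C4", "C5", "C6"]
all_pairings = list(pairings(cysteines))
print(len(all_pairings))
for bonds in all_pairings[:3]:
    print(bonds)
```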

With 2 to 6 variable residues between each pair of 
Cys residues, there are 125 (5^3) possible framework 
arrangements for the mu class (2,2,2 to 6,6,6), and 625 (5^4) 
possible for the omega class (2,2,2,2 to 6,6,6,6). 
Randomizing the identity of the residues within each framework 
produces 10^10 to >10^30 peptides. "Cono-like" peptide 
libraries are constructed having a conserved disulfide 
framework, varied numbers of residues in each hypervariable 
region, and varied identity of those residues. Thus, a 
sequence for the structural framework for use in the present 
invention comprises Cys-Cys-Y-Cys-Y-Cys-Cys, or 
Cys-Y-Cys-Y-Cys-Cys-Y-Cys-Y-Cys, where Y is (NNK)x or (NNS)x; 
N is A, C, G or T; K is G or T; S is G or C; and x is from 2 
to 6. 
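
The framework counts given above can be reproduced by 
enumerating the loop lengths directly. The following Python 
sketch is an illustration under stated assumptions: the 
grouping of cysteines (CC, C, C, CC for mu and C, C, CC, C, C 
for omega) is taken from the frameworks depicted above, TGC is 
chosen arbitrarily as the Cys codon, and each variable residue 
is written as NNK. The function name frameworks is 
hypothetical. 

```python
from itertools import product

LOOP_LENGTHS = range(2, 7)  # 2 to 6 variable residues per loop
CYS_CODON = "TGC"           # assumption: either Cys codon (TGC or TGT) could be used


def frameworks(cys_blocks):
    """Yield (loop lengths, degenerate coding sequence) for each framework arrangement.

    cys_blocks lists the number of adjacent Cys residues in each conserved
    block; one variable loop lies between each pair of neighbouring blocks.
    """
    n_loops = len(cys_blocks) - 1
    for lengths in product(LOOP_LENGTHS, repeat=n_loops):
        parts = []
        # Pad with a zero-length loop so the final Cys block is emitted too.
        for block, loop in zip(cys_blocks, lengths + (0,)):
            parts.append(CYS_CODON * block)
            parts.append("NNK" * loop)
        yield lengths, "".join(parts)


mu = list(frameworks([2, 1, 1, 2]))
omega = list(frameworks([1, 1, 2, 1, 1]))
print(len(mu), len(omega))
print(mu[0])
```

Each arrangement yields one degenerate oligonucleotide design, 
so a "cono-like" library in this sketch would be a pool of 125 
(mu) or 625 (omega) such designs. 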

Framework structures that require the formation of 
one or more disulfide bonds under oxidizing conditions may 
create problems with respect to the natural lac repressor, 
which has 3 cysteine residues. All 3 of these residues, 
however, can be changed to other amino acids without a serious 
effect on the function of the molecule (see Kleina and Miller, 
supra). Plasmid pJS123 is derived from plasmid pMC5 by site 
specific mutagenesis and encodes a lac repressor identical to 
the lac repressor encoded on plasmid pMC5, except the cysteine 
codon at position 107 has been changed to a serine codon; the 
cysteine codon at position 140 has been changed to an alanine 
codon (alanine works better than serine at this position); and 
the cysteine codon at position 281 has been changed to a 
serine codon. Plasmid pJS123 (available in strain ARI 161 
from the American Type Culture Collection under the accession 
number ATCC No. 63 319) is therefore preferred for constructing 
random peptide libraries involving cysteine-linked framework 
structures. 

The lac repressor coding sequence in plasmid pJS123 
can be subjected to mutagenesis to improve the binding of the 
mutant coding sequence with lacO type sequences. A preferred 
method for performing this mutagenesis involves the 
construction of a coding sequence in plasmid pJS123 that 
encodes a fusion protein comprised of the cysteine depleted 
lac repressor, a spacer peptide, and a peptide ligand of known 
specificity. The resulting vector is subjected to mutagenesis 
by any of a variety of methods; a preferred method involves 
transformation of an E. coli mutator strain such as mutD5 (see 
Schaaper, Proc. Natl. Acad. Sci. USA 85, 8126-8130 (1988), 
incorporated herein by reference) and culture of the 
transformants to produce the fusion protein encoded by the 
vector. The fusion proteins are screened by the present 
method to find vectors that have been mutated to increase the 
binding affinity of the cysteine depleted lac repressor to the 
lacO sequence. One could combine this method with the method 
of constructing improved spacers, described above, to select 
for an improved cysteine depleted lac repressor-peptide spacer 
molecule. 

In such a fashion, plasmid pJS123 was modified to 
create plasmid pJS128, which was then introduced into a mutD 
mutator strain. Oligonucleotides were then cloned into the 
mutagenized vector to encode a D32.39 epitope joined to 
repressor via a random region of 5, 10, or 15 amino acids. 
This library was panned on D32.39 antibody for 5 rounds under 
increasingly stringent conditions. Individual clones were 
selected from the population of plasmids surviving after the 
fifth round and tested by a variety of assays. These assays 
included: (1) tests for ability to repress the chromosomal 
lac operon (a test of DNA binding affinity); (2) tests for 
plasmid copy number; (3) ELISA with D32.39 antibody to test 
for display of the peptide epitope; and (4) tests of plasmid 
recovery during panning. Several of these plasmids were 
sequenced in the random tail region to determine the structure 
of the linker peptide. A series of subcloning experiments 
was also conducted to determine regions of the plasmids that 
determined the observable properties of the plasmids. 
Finally, plasmids carrying a higher copy number replication 
origin and encoding one of the linker regions were constructed 
and sequenced to ascertain that no base changes in the 
cysteine free repressor gene, as compared to the starting 
plasmid, were introduced. The linker tail of this plasmid and 
the cloning strategy for random libraries are shown in Fig. 4. 
Two versions of the vector were constructed, one with the 
cysteine-free lac repressor gene (ARI246/pJS141; ATCC 
No. 69033) and one with the wild-type lac repressor gene 
(ARI230/pJS142; ATCC No. 69037). (These cell lines will be 
maintained at an authorized depository and replaced in the 
event of mutation, nonviability or destruction for a period of 
at least five years after the most recent request for release 
of a sample was received by the depository, for a period of at 
least thirty years after the date of the deposit, or during 
the enforceable life of the related patent, whichever period 
is longest. All restrictions on the availability to the 
public of these cell lines will be irrevocably removed upon 
the issuance of a patent from the above-captioned 
application.) 

ARI246 has the genotype E. coli B lon-11 sulA1 
hsdR17 Δ(ompT-fepC) ΔclpA319::kan lacI42::Tn10 lacZU118. The 
lon-11, Δ(ompT-fepC), and ΔclpA319::kan mutations destroy 
three genes involved in proteolysis, so this strain should 
allow greater diversity of peptides to be expressed on the 
library particles. The sulA1 mutation suppresses the 
filamentation phenotype caused by the lon-11 allele. The 
hsdR17 mutation destroys the restriction system to allow more 
efficient transformation of unmodified DNA. The lacI42::Tn10 
mutation eliminates expression of the chromosomal lac 
repressor gene to prevent competition of wild-type repressor 
for binding sites on the library plasmids. The lacZU118 
allele stops expression of β-galactosidase, which would 
otherwise be constitutive in the lacI42::Tn10 background, 
leading to unnecessary use of cell resources and reducing 
growth rates. E. coli B cells grow more quickly than K12 
cells and yield excellent electrocompetent cells for 
transformation. Transformation frequencies of around 5 x 
10^10 tf/µg of Bluescript plasmid DNA can be frequently 
observed with ARI246 cells. ARI230 has the same genotype as 
ARI246, except that the lacI mutation has been converted to a 
deletion by selecting for loss of the Tn10 insertion, and a 
recA::cat mutation has been introduced. The recA::cat 
mutation is useful to prevent homologous recombination between 
plasmids. As a consequence, the library plasmids exist more 
frequently as monomers, rather than multimeric forms that can 
be observed in ARI246. The monomers are better for two 
reasons: monomers reduce the valency of peptides per library 
particle, allowing more stringent selection for higher 
affinity peptide ligands; and growth as monomers increases the 
number of plasmids per amount of DNA, increasing the number of 
library equivalents that can be panned against receptors. The 
recA::cat mutation makes the strain less healthy, so growth 
rates are slower, and the transformation frequency is reduced 
to about 2 x 10^10 tf/µg. 

Other changes can be introduced to provide residues 
that contribute to the peptide structure, around which the 
variable amino acids are encoded by the library members. For 
example, these residues can provide for alpha helices, a 
helix-turn-helix structure, four helix bundles, a beta-sheet, 
or other secondary or tertiary structural (framework or 
scaffolding) motifs. See U.S.S.N. 07/713,577, filed June 20, 
1991, incorporated herein by reference. DNA binding peptides, 
such as those that correspond to the transcriptional 
transactivators referred to as leucine zippers, can also be 
used as a framework, provided the DNA binding peptide is 
distinct from the DNA binding protein component of the fusion 
protein and the library vector does not contain the binding 
site for the DNA binding peptide. In these peptides, leucine 
residues are repeated every seven residues in the motifs, and 
the region is adjacent to an alpha helical region rich in 
lysines and arginines and characterized by a conserved helical 
face and a variable helical face. 

Other specialized forms of structural constraints 
can also be used in the present invention. For example, 
certain serine proteases are inhibited by small proteins of 
conserved structure (e.g., pancreatic trypsin inhibitor). 
This conserved framework can incorporate degenerate regions as 
described herein to generate libraries for screening for novel 
protease inhibitors. 
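
With regard to the leucine zipper frameworks mentioned above, 
the seven-residue spacing of the leucines can be checked in a 
candidate framework sequence by simple inspection. The 
following Python sketch is illustrative only. The function 
name leucine_heptads and the threshold of four consecutive 
heptads are assumptions made for the example, and the test 
sequence is synthetic rather than taken from any natural 
protein. 

```python
def leucine_heptads(seq, min_repeats=4):
    """Return 0-based start positions where Leu recurs every 7th residue.

    A position qualifies when it and the next (min_repeats - 1) positions
    spaced seven residues apart are all leucine (one-letter code "L").
    """
    seq = seq.upper()
    span = 7 * (min_repeats - 1)
    hits = []
    for start in range(len(seq) - span):
        if all(seq[start + 7 * k] == "L" for k in range(min_repeats)):
            hits.append(start)
    return hits


# Synthetic example: Leu at positions 0, 7, 14 and 21, Ala elsewhere.
framework = "LAAAAAA" * 4
print(leucine_heptads(framework))
```

In a library design, the positions reported by such a check 
would be held constant while the remaining positions of the 
helix are encoded by degenerate codons. 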

In another aspect related to frameworks for a 
peptide library, information from the structure of known 
ligands can be used to find new peptide ligands having 
features modified from those of the known ligand. In this 
embodiment, fragments of a gene encoding a known ligand, 
prepared by, e.g., limited DNAse digestion into pieces of 20 
to 100 base pairs, are subcloned into a variable nucleotide 
region system as described herein either singly or in random 
combinations of several fragments. The fragment library is 
then screened in accordance with the procedures herein for 
binding to the receptor to identify small peptides capable of 
binding to the receptor and having characteristics which 
differ as desired from the parental peptide ligand. This 
method is useful for screening for any receptor-ligand 
interaction where one or both members are encoded by a gene, 
e.g., growth factors, hormones, cytokines and the like, such 
as insulin, interleukins, insulin-like growth factor, etc. In 
this embodiment of the invention, the peptide library can 
contain as few as 10 to 100 different members, although 
libraries of 1000 or more members will generally be used. 

Thus, the present invention can be used to construct 
peptide ligands of great diversity. The novel features of the 
preferred embodiment of the invention, called "peptides on 
plasmids", in which the lac repressor is the DNA binding 
protein and a plasmid vector encodes the fusion protein, are 
distinct from those of the previously described phage 
libraries. The random peptides of the present libraries can 
be displayed with a free carboxy terminus instead of being 
displayed at the amino terminus or internal to the carrier 
protein and so add diversity to the peptide structures 
available for receptor binding. The presentation of 
ligands at the carboxy terminus also facilitates amidation, as 
discussed above. This mode of display also ensures that stop 
codons in the degenerate region, which occur more often in 
longer degenerate oligonucleotides, shorten rather than 
destroy individual clones. The presence of stop codons in the 
random peptide coding sequence actually serves to create 
additional diversity, by creating peptides of differing 
lengths. The lac repressor fusions of the invention also 
allow the display of potential ligands with a wide range of 
sizes. 
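
The effect of stop codons on peptide length can be estimated 
with a short calculation. The following Python sketch is 
illustrative only. It assumes an NNK degenerate region, in 
which one of the 32 possible codons (TAG) is a stop codon, 
that all 32 codons are equally likely, and that the host does 
not suppress the stop codon. The function names are 
hypothetical. 

```python
from fractions import Fraction

P_STOP_NNK = Fraction(1, 32)  # assumption: 1 stop codon (TAG) among 32 NNK codons


def full_length_fraction(n_codons, p_stop=P_STOP_NNK):
    """Probability that none of n_codons random codons is a stop codon."""
    return (1 - p_stop) ** n_codons


def length_distribution(n_codons, p_stop=P_STOP_NNK):
    """Probability of displaying exactly k random residues, for k = 0 .. n_codons."""
    dist = [(1 - p_stop) ** k * p_stop for k in range(n_codons)]
    dist.append(full_length_fraction(n_codons, p_stop))
    return dist


for n in (5, 10, 15):
    print(n, float(full_length_fraction(n)))
```

Under these assumptions about 85%, 73% and 62% of clones carry 
a full-length random region of 5, 10 and 15 codons, 
respectively. In a C-terminal display format the remainder 
are displayed as shorter peptides rather than lost. 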

In addition, these lac repressor fusions are 
cytoplasmic proteins, unlike the phage fusions, which are 
exported to the periplasm. The use of both fusion methods 
increases total available peptide diversity, because the two 
types of libraries are exposed to different cellular 
compartments and so are exposed to different sets of E. coli 
proteases and to different reduction/oxidation environments. 
There is no need, however, for peptides fused to the lac 
repressor to be compatible with the protein export apparatus 
and the formation of an intact phage coat. The peptides need 
simply be compatible with the formation of at least a 
repressor dimer, which is the smallest form of the protein 
that can bind DNA (see Daly and Matthews, Biochem. 25, 
5474-5478 (1986); and Kania and Brown, Proc. Natl. Acad. Sci. 
USA 73, 3529-3533 (1976)). 

As in the phage system, the lac repressor fusion 
library displays multiple copies of the peptide on each 
library particle. Each repressor tetramer, in principle, 
displays four peptides that are available for binding to 
receptors. In addition, each plasmid monomer can bind up to 
two tetramers (if no loop is formed), and multimers of the 
plasmid can display higher multiples of two tetramers. This 
multivalent display allows the isolation of ligands with 
moderate affinity (micromolar Kd, see Cwirla et al., supra). 
For receptors with known, high affinity peptide ligands, these 
moderate affinity ligands can obscure the high affinity ones 
simply because of their greater numbers. This problem can be 
overcome by immobilizing monovalent receptors at low density, 
which allows high affinity (nanomolar Kd) ligands to be 
identified, as discussed above. For receptors whose normal 
ligands are not small peptides, however, this multivalency of 
display will be an advantage for identifying initial families 
of moderate affinity ligands, which can then be optimized by 
additional rounds of screening under monovalent conditions. 
The multivalency of ligand display therefore allows the 
isolation of peptides with a wide range of affinities, 
depending on the density of the receptor during the panning 
procedure. 

Libraries of peptides produced and screened 
according to the present invention are particularly useful for 
mapping antibody epitopes. The ability to sample a large 
number of potential epitopes as described herein has clear 
advantages over the methods based on chemical synthesis now in 
use and described in, among others, Geysen et al., J. Immunol. 
Meth. 102, 259-274 (1987). In addition, these libraries are 
useful in providing new ligands for important binding 
molecules, such as hormone receptors, adhesion molecules, 
enzymes, and the like. 

The present libraries can be generalized to allow 
the screening of a wide variety of peptide and protein 
ligands. In addition, the vectors are constructed so that 
screening of other ligands encoded by the plasmid is possible. 
For example, the system can be simply modified to allow 
screening of RNA ligands. A known RNA binding protein (e.g., 
a ribosomal protein) is fused to the DNA binding protein. A 
promoter elsewhere on the vector drives expression of an RNA 
molecule composed of the known binding site for the RNA 
binding protein followed by random sequence. The DNA-RNA 
binding fusion protein would link the genetic information of 
the vector with each member of a library of RNA ligands. 
These RNA ligands could then be screened by panning 
techniques. 

Another large class of possible extensions to this 
technique is to use a modified version of the vector to 
isolate genes whose products modify peptides, proteins, or RNA 
in a desired fashion. This requires the availability of a 
receptor that binds specifically to the modified product. For 
the general case, a connection is made between the plasmid and 
the substrate peptide, protein, or RNA, as described above. 
The plasmid is then used as a cloning vector to make libraries 
of DNA or cDNA from a source with the potential to contain the 
desired modification gene (specific organisms, PCR amplified 
antibody genes, etc.) under the control of a promoter that 
functions in E. coli. Plasmids carrying the gene in question 
could then be isolated by panning lysates of the library with 
the receptor specific for the modified product. 

For example, a gene encoding an enzyme that cleaves 
a particular amino acid sequence could be isolated from 
libraries of DNA from organisms that might have such a 
protease or from amplified antibody cDNA. An antibody for use 
as the receptor would first be made to the peptide that would 
remain after the desired cleavage reaction had taken place. 
Many such antibodies will not bind to that peptide unless it 
is exposed at the N- or C-terminus of the protein. The coding 
sequence for the uncleaved substrate sequence would be 
attached to the DNA binding protein coding sequence in a 
vector. This vector would be used to make an expression 
library from an appropriate source. Members of this library 
containing a gene that encoded an enzyme able to cleave the 
peptide would cleave only the peptide attached to the plasmid 
with that gene. Panning of lysates of the library would 
preferentially isolate those plasmids with active genes. 

Selection of DNA Binding Proteins by Forced Evolution 

Although some DNA binding proteins for use in the 
invention are obtained directly from the repertoire of natural 
DNA binding proteins, other DNA binding proteins are selected 
by a process termed forced evolution. Forced evolution 
selects a DNA binding protein optimal for use in the peptides 
on plasmids screening methods described elsewhere in the 
specification. The functional properties that allow a DNA 
binding protein to survive the forced evolution process are 
the very same properties that confer optimum capacity to 
screen peptides in the peptides on plasmids method. Thus, the 
forced evolution process does not require prior knowledge of 
the binding mechanism of a DNA binding protein or prior 
definition of criteria (e.g., dissociation half-life, 
conformation) on which the efficacy of a DNA binding protein 
in the peptides on plasmids method is founded. 

The method for performing forced evolution of a DNA 
binding protein is closely analogous to the methods of 
screening peptides on plasmids. The main difference between 
screening peptides on plasmids and forced evolution lies in 
whether the peptide component or the DNA binding component of 
a fusion protein is varied in different members of a library. 
In the peptides on plasmids method, the DNA binding protein is 
constant and the peptide moiety varies in different members of 
the library. The methods select a peptide with specific 
affinity for a receptor. 

In the forced evolution method, the peptide is 
constant between different members of a library, and the DNA 
binding protein varies between members. As in the peptides on 
plasmids method, cells are transformed with libraries of 
vectors encoding fusion proteins. The fusion proteins 
comprise a potential DNA binding protein fused to the constant 
peptide. The cells are cultured under conditions in which the 
fusion proteins are expressed. If a fusion protein comprises 
a potential DNA binding protein that in fact has an affinity 
for the vector encoding it, the fusion protein binds to the 
vector to form a complex. The cells are lysed, releasing 
complexes. 

Complexes are screened by affinity purification on a 
receptor known to bind the peptide present in all of the 
complexes. Vectors are purified from complexes binding to the 
receptor via the peptide, amplified (e.g., by retransformation 
or PCR) and sequenced to reveal the identity of DNA binding 
proteins that have survived the selection process. To have 
survived the selection process, a DNA binding protein must 
have two properties: (1) capacity to remain complexed with the 
vector encoding it throughout the screening process; and 
(2) capacity to display the peptide with a conformation 
suitable for interaction with its receptor. These are the 
same properties that make a DNA binding protein useful for 
displaying a peptide in the peptides on plasmids method. 

(1) Sources of Potential DNA Binding Proteins 
The oligonucleotides encoding the potential DNA 
binding proteins can derive from a number of sources. Often, 
one starts with a natural DNA binding protein, in which case, 
the different potential DNA binding proteins represent 
variants of the natural DNA binding protein. Variants of a 
natural DNA binding protein can be produced by PCR mutagenesis 
of a DNA sequence encoding the protein. PCR mutagenesis can 
result in a low rate of mutagenesis at any position of the 
coding sequence. Thus, typically, each potential DNA binding 
protein shows a high degree of sequence identity (e.g., at 
least 95 or 98% sequence identity) with the natural protein, 
but the collective library includes variants at all or nearly 
all of the positions in the protein. PCR mutagenesis is 
particularly suitable for natural DNA binding proteins which 
have not been extensively characterized, and for which there 
is little information about which amino acid residues are 
critical for binding. For other DNA binding proteins, such as 
lacI, for which prior studies have already identified certain 
positions as being important for binding, mutagenesis can be 
focussed on these positions. For example, the coding sequence 
of the natural protein can be synthesized on a DNA 
synthesizer, but with the introduction of randomized codons at 
the critical loci for binding. 
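
The relationship between a low per-position mutation rate, 
high identity of each variant to the parent, and broad 
coverage of positions across the library can be illustrated by 
simulation. The following Python sketch is a simplified model 
and not a description of any actual PCR protocol: it works at 
the nucleotide level rather than the amino acid level, treats 
every position as equally mutable, and uses an arbitrary rate 
of 3% and a random parent sequence. All function names are 
hypothetical. 

```python
import random


def mutagenize(seq, rate, rng):
    """Copy seq, replacing each base by a different base with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def identity(a, b):
    """Fraction of aligned positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def library_coverage(parent, variants):
    """Fraction of parent positions that are altered in at least one variant."""
    hit = set()
    for variant in variants:
        hit.update(i for i, (x, y) in enumerate(zip(parent, variant)) if x != y)
    return len(hit) / len(parent)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
parent = "".join(rng.choice("ACGT") for _ in range(300))
library = [mutagenize(parent, 0.03, rng) for _ in range(200)]
mean_identity = sum(identity(parent, v) for v in library) / len(library)
print(round(mean_identity, 3), round(library_coverage(parent, library), 3))
```

In this model each variant remains close to the parent while 
the library as a whole samples changes at most positions, 
which is the property relied upon in the passage above. 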

The methods can screen multiple natural DNA binding 
proteins, or variants thereof, simultaneously. The methods 
can also screen potential binding proteins containing repeated 
copies of a natural binding domain or binding domains obtained 
from more than one natural protein. The potential DNA binding 
proteins can also be variants of a consensus DNA binding 
protein sequence or any theoretical DNA sequence thought to 
have DNA binding properties. The potential DNA binding 
proteins can also constitute random sequences from an epitope 
library encoding all or a substantial number of all possible 
peptide epitopes of a given length. 

(2) DNA Binding Sequences 

Surprisingly, it has been found that the forced 
evolutionary method is sufficiently powerful that it can 
select variants of a natural binding protein that bind with a 
different specificity than the natural protein and yet show 
improved characteristics for use in the peptides on plasmids 
selection method. Thus, in general there is no need to 
include a specific DNA binding sequence known to show a 
specific affinity for a natural DNA binding protein in the 
recombinant DNA vector. Likewise, the DNA binding protein 
need not show a strong preference for a specific sequence. 
Thus, nonsequence-specific DNA binding proteins such as 
histones are suitable. 

In some applications, however, it is desirable to 
evolve a DNA binding protein having specificity for a 
predetermined sequence. In such applications, one includes 
this sequence in the recombinant vector and screens potential 
DNA binding proteins that are variants of the natural protein 
having affinity for that sequence. As discussed below, the 
conditions of selection can be tailored to drive evolution in 
favor of variants retaining affinity for the specific sequence 
and showing improved characteristics relative to the natural 
DNA binding protein. If variants of multiple natural DNA 
binding proteins are being screened simultaneously, a separate 
vector is constructed for each DNA binding protein, containing 
the recognition sequence for that binding protein. Families 
of oligonucleotide variants are then produced separately for 
each DNA binding protein and cloned into the vector encoding 
that binding protein and the corresponding recognition 
sequence. At this point, all the vectors can be mixed and 
transformation and selection can proceed as for screening 
variants to a single natural binding protein. 

There are a number of strategies to drive evolution 
toward selection of DNA binding proteins having specificity 
for a given sequence. For example, the affinity purification
step can be performed in the presence of a large excess of DNA
lacking the specific binding site present in the vector.
Ideally, this DNA constitutes a derivative of the vector from
which the specific binding site for the DNA binding protein
has been deleted. However, bulk DNA from commercial sources,
such as herring or salmon sperm DNA, is generally adequate.
The presence of DNA lacking the specific binding site in the
screening buffer accelerates dissociation of DNA binding
proteins bound other than at the specific site, resulting in an
enrichment for complexes containing DNA binding proteins with
affinity for the specific site. 

In certain instances, retention of sequence-specific 
binding can be ensured by in vivo selection. For example, 
variants of a lacI DNA binding protein can be propagated in an
E. coli strain having a defective chromosomal lacI gene on
media containing the chromogenic substrate X-gal. Variant DNA
binding proteins retaining affinity for the lacO operator
repress expression of β-galactosidase, and thus the cells do not
metabolize the X-gal to a blue-colored product. Variant DNA
binding proteins having lost affinity for the lacO operator
allow expression of β-galactosidase and give rise to blue colonies.

(3) Optimization of Linkers

In the fusion proteins used in the above methods,
the DNA binding protein can be separated from the peptide by a
peptide linker. Similarly, if the DNA binding protein has more
than one domain, the domains can be separated by additional
linker(s). Optimal peptide linkers for use in these methods
can be selected by the same forced evolutionary process by which
DNA binding proteins are selected. Linker(s) can be mutated
and screened contemporaneously with the DNA binding protein.
For example, a segment of DNA encoding both linkers and a DNA
binding protein can be subjected to PCR mutagenesis.
Alternatively, linkers can be mutagenized and screened before 
or after optimizing the DNA binding protein with which they 
are to be used. 

(4) Selection of Fusion Sites

Generally, peptides are fused either at or near the
N- or C-terminus of a DNA binding protein, because these sites
offer the least constrained display of peptide with the least
likelihood of disrupting the DNA binding protein. However,
other viable sites of insertion are readily selected using the
forced evolution method. For example, a library of vectors is
constructed in which a common peptide is inserted at different
sites in a DNA binding protein under test. The library of
vectors is transformed and propagated in host cells, and the
vectors isolated and panned for binding to the peptide
receptor via the displayed peptide. The vectors binding to
the receptor are those in which the site of insertion resulted
in display of the peptide without disrupting the binding
characteristics of the DNA binding protein.

(5) Successive Rounds of Enrichment 

A single round of propagation and affinity
purification of a library of potential DNA binding proteins
selects a pool of vectors encoding DNA binding proteins that
are at least somewhat useful for screening peptides in the
peptides on plasmids method. Further optimized DNA binding
proteins are obtained by performing successive rounds of
enrichment. That is, vectors present in complexes bound to
the receptor in the screening process are isolated,
retransformed into host cells, and the selection process is
repeated. Each round of selection results in greater
enrichment for DNA binding proteins having optimal
characteristics for use in screening peptides, because the
vectors encoding these proteins are statistically most likely
to survive the selection. The stringency of binding and wash
buffers can be increased in successive rounds of screening, as
is the case when screening peptide libraries. Typically,
vectors surviving four rounds of affinity selection encode DNA
binding proteins having highly suitable characteristics for
peptide display.

In general, a DNA binding protein surviving the
evolutionary process is optimized for use in the peptides on
plasmids method under the same conditions as those employed in
the evolutionary process. Thus, the same or similar
conditions should be employed in subsequent use of a DNA
binding protein in the peptides on plasmids method as were
used in selecting the DNA binding protein. The conditions
employed during evolution of DNA binding proteins can be
changed to customize DNA binding proteins for different
purposes. For example, elution of complexes bound to the
screening receptor by competition with free peptide biases the
selection toward survival of DNA binding proteins monovalently
bound to the receptor.

Suitability can be quantified by the enrichment
ratio conferred by a DNA binding protein. A vector encoding
the DNA binding protein fused to a peptide is transformed into
host cells and propagated as described previously to form
complexes between the vector and fusion protein. Cells are
lysed and the complexes are screened for binding to a receptor
having affinity for the peptide, and (separately) to a
receptor lacking affinity for the peptide. Vectors are
recovered from bound complexes in the two situations and
transformed into host cells. The enrichment ratio is the
number of transformants resulting from screening with the
receptor having affinity for the peptide divided by the number
of transformants from screening with the receptor lacking
affinity for the peptide.
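
The enrichment ratio defined above can be sketched as a short calculation. This is a minimal illustration only: the function name and the colony counts are hypothetical and do not come from the specification.

```python
def enrichment_ratio(specific_transformants: float,
                     control_transformants: float) -> float:
    """Transformants recovered with the receptor that binds the peptide,
    divided by transformants recovered with the receptor that does not."""
    if control_transformants <= 0:
        # A ratio is undefined when no colonies are recovered from the control.
        raise ValueError("control count must be positive to form a ratio")
    return specific_transformants / control_transformants


# Hypothetical colony counts, for illustration only.
ratio = enrichment_ratio(4200, 35)
print(f"enrichment ratio = {ratio:.0f}")
```

A higher ratio indicates a DNA binding protein that holds the displayed peptide on its encoding vector more reliably through the binding and wash steps.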

(6) Other Uses of DNA Binding Proteins

As well as being ideal for use in the peptides on
plasmids screening method, DNA binding proteins resulting from
forced evolution have a number of other uses. For example,
DNA binding proteins can be used as carriers for transfer of
DNA into cells. See WO 34/25508. DNA binding proteins
customized to bind a specific sequence unique to a pathogenic
microorganism, such as HIV, are also useful for therapeutic
intervention and/or diagnosis of such an organism. See, e.g.,
Ladner et al., US 5,193,345.

As can be appreciated from the disclosure above, the
present invention has a wide variety of applications.
Accordingly, the following examples are offered by way of
illustration, not by way of limitation.

EXAMPLE 1

Construction of Plasmids pMC3 and pMC5
The bacterial strains used were E. coli K12 strains
MC1061 (araD139 Δ(araABC-leu)7696 thr ΔlacX74 galU galK
hsdR mcrB rpsL(strA) thi), ARI 20 (F' lac- pro+ lacIqL8
lacIam74 // Δ(lac-pro) thi rpsL(strA) recA::cat), and
XL1-Blue (F' proAB lacIq lacZΔM15 Tn10 // recA1 endA1
gyrA96 thi hsdR17 supE44 relA1 lac), and E. coli B strain
ARI 161 (lon-11, sulA1, hsdR17, Δ(ompT-fepC),
ΔclpA319::kan). ARI 161 is a protease deficient strain and
serves to minimize proteolysis of the peptides in the library,
which would reduce the available diversity for panning.
Mutations known to reduce proteolysis include degP, lon, htpR,
ompT, and clpA,P.

The library plasmid pMC5 was constructed in several
steps using plasmid pBAD18 as the starting plasmid. Plasmid
pBAD18 contains the araB promoter followed by a polylinker and
a terminator under the control of the positive/negative
regulator araC, also specified by the plasmid. Plasmid pBAD18
also contains a modified plasmid pBR322 origin and the bla
gene to permit replication and selection in E. coli, as well
as the phage M13 intergenic region to permit rescue of
single-stranded DNA for sequencing.

The lacI gene was modified for cloning into plasmid
pBAD18 using the GeneAmp® PCR amplification kit (Perkin-Elmer
Cetus Instruments) with oligonucleotides ON-236 and ON-237,
shown below:

ON-236  5'-GCG GGC TAG CTA ACT AAT GGA GGA TAC ATA AAT GAA
        ACC ACT AAC GTT ATA CG-3' [SEQ ID NO: 68]

ON-237  5'-CGT TCC GAG CTC ACT GCC CGC TCT CGA GTC GGG AAA
        CCT GTC GTG C-3' [SEQ ID NO: 69].

The amplification reaction was carried out according to the
manufacturer's instructions, except for the use of Vent™ DNA
polymerase (New England Biolabs). ON-236 contains a
nonhomologous 5' region that adds an NheI site, a consensus
ribosome binding site (see Gold and Stormo, Methods in
Enzymology (Goeddel, ed., Boston: Academic Press), pp. 89-103
(1990), incorporated herein by reference), and changes the
initiation codon of lacI from GTG to ATG. ON-237 changes
codons 355 and 357 of lacI to an XhoI site through two silent
mutations, and adds a SacI site after the lacI stop codon.
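
The restriction sites that the two amplification primers are said to introduce can be located with a simple string search. The sketch below is illustrative only: the primer sequences are transcribed from the listing above, the recognition sequences (NheI GCTAGC, SacI GAGCTC, XhoI CTCGAG) are the standard ones, and the helper name `find_sites` is hypothetical. Because all three sites are palindromic, searching one strand is sufficient.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {"NheI": "GCTAGC", "SacI": "GAGCTC", "XhoI": "CTCGAG"}

# Primer sequences as transcribed from the listing above (5' to 3').
PRIMERS = {
    "ON-236": "GCG GGC TAG CTA ACT AAT GGA GGA TAC ATA AAT GAA "
              "ACC ACT AAC GTT ATA CG",
    "ON-237": "CGT TCC GAG CTC ACT GCC CGC TCT CGA GTC GGG AAA "
              "CCT GTC GTG C",
}


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based index of the first match} for sites present."""
    seq = primer.replace(" ", "").upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for name, primer in PRIMERS.items():
    print(name, find_sites(primer))
```

Under this transcription, ON-236 carries only the NheI site, along with a GGAGG ribosome binding motif upstream of an ATG, and ON-237 carries both the SacI and XhoI sites, which agrees with the description in the text.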

Cloning of the NheI, SacI digested amplification
product into plasmid pBAD18 produced vector pJS100. Two lacOs
sequences were added to this vector, with their centers spaced
325 bp apart, by amplifying an unrelated sequence (the human
D2 dopamine receptor gene; see England et al., FEBS Lett. 279,
87-90 (1991), and U.S. S.N. 07/545,029, filed January 22, 1991,
both of which are incorporated herein by reference) with
oligonucleotides ON-295 and ON-296, shown below:

ON-295  5'-CCT CCA TAT GAA TTG TGA GCG CTC ACA ATT CGG TAG
        AGC CCC ATC CCA CCC-3' [SEQ ID NO: 70]

ON-296  5'-CCC CAT CGA TCA ATT GTG AGC GCT CAC AAT TCA GGA
        TGT GTG TGA TGA AGA-3' [SEQ ID NO: 71]

ON-295 adds an NdeI site and a lacOs sequence at one end of
the amplified fragment, and ON-296 adds a ClaI site and lacOs
at the other end. Cloning of the NdeI to ClaI fragment into
pJS100 produced plasmid pJS102.

Plasmid pMC3, encoding the dynorphin B-tailed lac
repressor, was constructed by cloning complementary
oligonucleotides ON-312 and ON-313 to replace the XhoI to
XbaI fragment at the 3' end of lacI in pJS102. These
oligonucleotides add sequence encoding a five amino acid
spacer (GADGA [SEQ ID NO: 55]) and dynorphin B (YGGFLRRQFKVVT
[SEQ ID NO: 7]) to the end of the wild-type lacI sequence,
introduce an SfiI site in the sequence encoding the spacer,
and are shown below:

ON-312  5'-TCG AGA GCG GGC AGG GGG CCG ACG GGG CCT ACG GTG
        GTT TCC TGC GTC GTC AGT TCA AAG TTG TAA CCT AAT-3'

ON-313  5'-CTA GAT TAG GTT ACA ACT TTG AAC TGA CGA CGC AGG
        AAA CCA CCG TAG GCC CCG TCG GCC CGC TGC CCG
        CTC-3' [SEQ ID NO: 73]

The library plasmid pMC5 was constructed by cloning
complementary oligonucleotides ON-335 and ON-336 to replace
the SfiI to HindIII dynorphin B segment of pMC3, as shown in
Fig. 2. Oligonucleotides ON-335 and ON-336 are shown below:

ON-335  5'-GGG CCT AAT TAA TTA-3' [SEQ ID NO: 74]

ON-336  5'-AGC TTA ATT AAT TAG GCC CCG T-3' [SEQ ID NO: 75]

Plasmid pMC3 is available in strain ARI 161 from the American
Type Culture Collection under the accession number ATCC No.
53313.

EXAMPLE 2 

Construction of a Random Dodecamer Peptide Library
Oligonucleotide ON-332 was synthesized with the
sequence:

5'-GT GGC GCC (NNK)12 TAA GGT CTC G-3' [SEQ ID NO: 76]

where N is A, C, G, or T (equimolar) and K is G or T (see
Cwirla et al., supra). The oligonucleotide was purified by
HPLC and phosphorylated with T4 kinase (New England Biolabs).
The two half-site oligonucleotides ON-369 and ON-370 were
phosphorylated during synthesis and are shown below:

ON-369  5'-GGC GCC ACC GT-3' [SEQ ID NO: 77]

ON-370  5'-AGC TCG AGA CCT TA-3' [SEQ ID NO: 78]

ON-369 and ON-370 annealed to ON-332 produce SfiI and HindIII-
compatible ends, respectively, but the ligated product does
not have either recognition sequence (see Fig. 2).
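
The coding properties of the degenerate (NNK)12 region described above can be checked by enumeration. The following sketch assumes the standard genetic code and equal representation of each codon; the variable names are illustrative and not taken from the specification.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
encoded = [CODON_TABLE[c] for c in nnk_codons]

n_stop = encoded.count("*")              # stop codons reachable with NNK
amino_acids = set(encoded) - {"*"}       # distinct amino acids encoded

# Chance that 12 consecutive NNK codons contain no stop codon,
# assuming every NNK codon is equally likely.
p_full_length = ((len(nnk_codons) - n_stop) / len(nnk_codons)) ** 12

print(len(nnk_codons), n_stop, len(amino_acids), round(p_full_length, 3))
```

Restricting the third position to G or T leaves 32 codons that still encode all 20 amino acids while retaining only one stop codon (TAG), so roughly two thirds of random dodecamer inserts are expected to be free of stop codons under the equal-frequency assumption.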

Four hundred pmoles of each oligonucleotide were
annealed in a 25 µl reaction buffer (10 mM Tris, pH 7.4, 1 mM
EDTA, 100 mM NaCl), by heating to 55°C for 10 min., and cooling
for 30 min. to room temperature. Vector pMC5 was digested to
completion with SfiI and HindIII, and the vector backbone was
isolated by 4 rounds of washing with TE buffer (10 mM Tris,
pH 8.0, 1 mM EDTA) in a Centricon 100 microconcentrator
(Amicon) by the manufacturer's instructions, followed by
phenol extraction and ethanol precipitation. The annealed
oligonucleotides were added to 64 micrograms of digested pMC5
at a 4:1 molar ratio in a 3.2 ml ligation reaction containing
5% PEG, 3200 units of HindIII, 194 Weiss units of T4 ligase
(New England Biolabs), 1 mM ATP, 20 mM Tris, pH 7.5, 10 mM
MgCl2, 0.1 mM EDTA, 50 µg/ml BSA, and 2 mM DTT. The reaction
was split equally into 8 tubes and incubated overnight at
15°C.

After ethanol precipitation, 1/16 of the ligated DNA
(4 µg) was introduced into MC1061 (30 µl) by electroporation
(Dower et al., Nucl. Acids Res. 16, 6127-6145 (1988),
incorporated herein by reference), to yield 5.5 x 10^8
independent transformants. The library was amplified
approximately 1000-fold in 1 liter of LB/100 µg/ml ampicillin
by growth of the transformants at 37°C to an A600 of 1. The
cells containing the library were concentrated by
centrifugation at 5500 x g for 6 min., washed once in ice-cold
50 mM Tris (pH 7.5), 10 mM EDTA, 100 mM KCl, followed by a
wash in ice-cold 10 mM Tris, 0.1 mM EDTA, 100 mM KCl. The
final pellet was resuspended in 15 ml of HEG buffer (35 mM
HEPES/KOH pH 7.5, 0.1 mM EDTA, 100 mM Na glutamate),
distributed into 19 tubes of 1.0 ml each, frozen on dry ice,
and stored at -70°C.

EXAMPLE 3

Panning the Library
One aliquot (1.0 ml) of the library prepared in
Example 2 was thawed on ice and added to 9 ml of lysis buffer
(35 mM HEPES (pH 7.5 with KOH), 0.1 mM EDTA, 100 mM Na
glutamate, 5% glycerol, 0.3 mg/ml BSA, 1 mM DTT, and 0.1 mM
PMSF). Lysozyme was added (0.3 ml at 10 mg/ml in HEG), and
the mixture was incubated on ice for 1 hr.

The cellular debris was removed by centrifugation of
the lysate at 20,000 x g for 15 min., and the supernatant was
concentrated by centrifugation in a Centriprep® 100
concentrator (Amicon) at 500 x g for 40 min. The concentrated
supernatant (about 0.5 ml) was washed with 10 ml of HEG buffer
and centrifuged as before. A sample (5%) of the total lysate
was removed to determine the pre-panned input of plasmid
complexes.

An alternate method for partially purifying and
concentrating the lysate is as follows. About 2.0 ml of the
frozen cells in HEG are thawed on ice, and then 8 ml of lysis
buffer without Na glutamate (high ionic strength inhibits
lysozyme; DTT is optional) are added to the cells, and the
mixture is incubated on ice for 1 hr. The cellular debris is
removed from the lysate by centrifugation at 20,000 x g for
15 min., and the supernatant is loaded onto a Sephacryl® S-400
High Resolution (Pharmacia) gel-filtration column (22 mm x
253 mm). The plasmid-fusion protein complexes elute in the
void volume. The void volume (30 ml) is concentrated with two
Centriprep® 100 concentrators, as described above. After
adjusting the Na glutamate concentration of the concentrate,
one carries out the remainder of the procedure in the same
manner as with the first method.

Half of the remaining concentrated lysate was added
to D32.39-antibody-coated sheep-anti-mouse (Fc)-coupled
magnetic beads (10 µg of D32.39 added to 5 mg Dynal beads for
1 hr. at 25°C followed by 5 washes with HEG), and half was
added to uncoated beads. After incubating the lysates with
the beads at 0°C for 1 hr. with shaking, the beads were washed
three times with 5 ml of cold HEG/0.1% BSA and then three
times with HEG using a MACS 0.5 tesla magnet (Miltenyi Biotec
GmbH) to immobilize the beads. The plasmids were dissociated
from the beads by phenol extraction, and after adding 20 µg of
glycogen (Boehringer Mannheim), the DNA was precipitated with
an equal volume of isopropanol. The pellet was washed with
75% ethanol, and the DNA was resuspended in either 4 µl
(panned DNA) or 400 µl (pre-panned DNA) of H2O. Strain MC1061
was transformed using 2 µl each of the DNA solutions to permit
counts of recovered plasmids and amplification of the selected
plasmids. The results of the panning are shown below in
Table 1.



                         TABLE 1

                (Number of Transformants)

Panning                     Ab D32.39              Uncoated
Round      Input            Beads                  Beads

1          1.6 x 10^10      9 x 10^7               1.7 x 10^[illegible]
2          1.4 x 10^11      6.1 x 10^7             1.2 x 10^4
3          1.7 x 10^11      2.0 x 10^9             40
4          1.5 x 10^11      4 x 10^[illegible]
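
The panning counts can be summarized as the fraction of input recovered on antibody-coated beads and as the ratio of antibody-bead recovery to uncoated-bead recovery. The sketch below uses only the round 2 and round 3 values as they can be read from Table 1; the dictionary layout and the function name `summarize` are illustrative assumptions.

```python
# Round 2 and round 3 values as read from Table 1 (number of transformants).
rounds = {
    2: {"input": 1.4e11, "antibody": 6.1e7, "uncoated": 1.2e4},
    3: {"input": 1.7e11, "antibody": 2.0e9, "uncoated": 40},
}


def summarize(row: dict) -> dict:
    """Fraction of input recovered, and recovery relative to uncoated beads."""
    return {
        "fraction_recovered": row["antibody"] / row["input"],
        "specific_over_background": row["antibody"] / row["uncoated"],
    }


for number, row in rounds.items():
    stats = summarize(row)
    print(f"round {number}: "
          f"recovered {stats['fraction_recovered']:.2e} of input, "
          f"{stats['specific_over_background']:.1e}-fold over uncoated beads")
```

On these figures, both the fraction recovered and the margin over uncoated beads rise from round 2 to round 3, which is the pattern expected when antibody-specific plasmids are being enriched.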



EXAMPLE 4 

ELISA Analysis of the Library
An ELISA was used to test MC1061 transformants from
the second, third, and fourth rounds for D32.39-specific
ligands (see Example 3). The ELISA was performed in a 96-well
plate (Beckman). Single colonies of transformants obtained
from panning were grown overnight in LB/100 µg/ml ampicillin
at 37°C. The overnight cultures were diluted 1/10 in 3 ml
LB/100 µg/ml ampicillin and grown 1 hr. The expression of the
lac repressor-peptide fusions was induced by the addition of
arabinose to a final concentration of 0.2%.

The cells were lysed as described above in 1 ml of
lysis buffer plus lysozyme and stored at -70°C. Thawed crude
lysate was added to each of 2 wells (100 µl/well), and the
plate was incubated at 37°C. After 45 min., 100 µl of 1% BSA
in PBS (10 mM NaPO4, pH 7.4, 120 mM NaCl, and 2.7 mM KCl) were
added for an additional 15 min. at 37°C, followed by 3 washes
with PBS/0.05% Tween 20. Each well then was blocked with 1%
BSA in PBS (200 µl/well) for 30 min. at 37°C, and the wells
were washed as before.

The primary antibody, D32.39 (100 µl of antibody at
1 µg/ml in PBS/0.1% BSA) was added to each well, the plate was
incubated at room temperature for 1 hr., and then each well
was washed as before. The secondary antibody, alkaline
phosphatase-conjugated goat anti-mouse antibody (Gibco-BRL),
was diluted 1/3000 in PBS/0.1% BSA and added to each well
(100 µl/well); the plate was then incubated for 1 hr. at room
temperature. After three washes with PBS/0.05% Tween 20 and
two with TBS (10 mM Tris pH 7.5, 150 mM NaCl), the ELISA was
developed with 4 mg/ml p-nitrophenyl phosphate in 1 M
diethanolamine/HCl pH 9.8, 0.24 mM MgCl2 (200 µl/well).

The reaction was stopped after 5 min. by the
addition of 2 M NaOH (50 µl/well), and the absorbance at
405 nm was measured on a plate reader (a Biomek, from
Beckman). The positive control for the ELISA was MC1061
transformed with pMC3, encoding the lac repressor-dynorphin B
fusion. The negative controls were wells not coated with
lysate. Background variability was calculated from the wells
containing lysates from 15 colonies selected at random from
the library, none of which scored significantly above the
negative controls. Wells were scored as positive if the
measured absorbance was at least two standard deviations above
background.
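
The scoring rule just described (positive if the absorbance is at least two standard deviations above background) can be sketched as follows. The absorbance values and well names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def positive_cutoff(background: list) -> float:
    """Mean background absorbance plus two sample standard deviations."""
    return mean(background) + 2 * stdev(background)


def score_wells(readings: dict, background: list) -> dict:
    """Map each well to True when its A405 reaches the cutoff."""
    cutoff = positive_cutoff(background)
    return {well: a405 >= cutoff for well, a405 in readings.items()}


# Hypothetical A405 readings, for illustration only.
background = [0.10, 0.12, 0.08, 0.11, 0.09]   # random library colonies
readings = {"A1": 0.85, "A2": 0.12}           # panned clones

print(round(positive_cutoff(background), 3))
print(score_wells(readings, background))
```

In this toy data set the well with a reading close to the background mean is scored negative and the well with a strong signal is scored positive.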

Of randomly picked colonies, 35 of 58 (60%) tested
positive by ELISA: 11 of 20 from round two, 12 of 16 from
round three, and 12 of 22 from round four. None of 15 random
colonies from the unpanned library scored significantly above
background. These data demonstrate the rapid enrichment of
specific ligands achieved by the present invention: after
only two rounds of panning, the majority of plasmids encoded
peptides with affinity for the D32.39 antibody.
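
The per-round ELISA tallies quoted above can be cross-checked against the overall figure with a short calculation. The data structure is illustrative; the counts are those stated in the text (11 of 20, 12 of 16, and 12 of 22).

```python
# (positive colonies, colonies tested) for each panning round, from the text.
positives = {"round 2": (11, 20), "round 3": (12, 16), "round 4": (12, 22)}

total_positive = sum(p for p, _ in positives.values())
total_tested = sum(n for _, n in positives.values())
overall_percent = 100 * total_positive / total_tested

for label, (p, n) in positives.items():
    print(f"{label}: {p}/{n} = {100 * p / n:.0f}%")
print(f"overall: {total_positive}/{total_tested} = {overall_percent:.0f}%")
```

The per-round counts sum to 35 positives among 58 colonies, which rounds to the 60% stated in the text.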

To determine the structure of the peptide ligands
obtained by the present method, plasmids from both ELISA
positive and ELISA negative colonies obtained after panning
were sequenced. Double stranded plasmid DNA, isolated from
strain XL1-Blue, was sequenced using Sequenase® (US
Biochemicals) according to the instructions supplied by the
manufacturer.

The translated peptide sequence for all ELISA
positive colonies examined shared the consensus sequence shown
in Fig. 3. The preferred recognition sequence for the D32.39
antibody apparently covers a six amino acid region of the
dynorphin B peptide (RQFKVV). In the first position, arginine
is invariant for all of the ELISA positive clones. No strong
bias was evident for residues in the second position. In the
third position, however, five amino acids (phenylalanine,
histidine, asparagine, tyrosine, and tryptophan, in order of
frequency) account for 93% of the residues. Of these, the
aromatic amino acids comprise 74% of this total. The fourth
position shows a strong bias for the positively charged
residues lysine (69%) and arginine (21%). The fifth position
is occupied almost exclusively by hydrophobic residues, most
of which are valine (81%). Valine and threonine predominate
in the sixth position (75%), with serine and isoleucine
accounting for most of the remaining amino acids.
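
Position-by-position residue frequencies of the kind reported above can be tabulated from a set of aligned peptide sequences. The sketch below is illustrative only: the four six-residue peptides are invented stand-ins, not sequences from Fig. 3, and the function name is hypothetical.

```python
from collections import Counter


def position_frequencies(peptides: list) -> list:
    """For each aligned position, return {residue: fraction of peptides}."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("peptides must be aligned to the same length")
    table = []
    for i in range(length):
        counts = Counter(p[i] for p in peptides)
        table.append({aa: n / len(peptides) for aa, n in counts.most_common()})
    return table


# Hypothetical aligned six-residue regions, for illustration only.
toy_peptides = ["RQFKVV", "RAFKVT", "RSHKVV", "RQYRIV"]

for position, freqs in enumerate(position_frequencies(toy_peptides), start=1):
    print(position, freqs)
```

With real clone sequences in place of the toy list, the same tabulation yields the kind of per-position percentages quoted for the D32.39 consensus.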

Of the ELISA negative clones obtained after panning,
greater than half showed peptide sequence similarity to the
consensus motif (Fig. 3). None of 19 isolates sequenced from
the unpanned library showed any such similarity. Some of
these ELISA negative sequences differ enough from the
consensus that their affinity for the antibody may be
insufficient to permit detection in the ELISA. There are,
however, ELISA negative sequences identical in the five
conserved amino acids of the consensus region to clones that
scored positive (e.g., #23 and #57). There may be amino acids
outside the consensus region that affect binding of the
peptide to antibody, its susceptibility to E. coli
proteases, or its availability in the ELISA. That even the
ELISA negative clones frequently have an obvious consensus
sequence demonstrates the utility of the present invention for
isolating ligands for biological receptors.

EXAMPLE 5

1. Optimization of linkers for headpiece dimer display

To obtain headpiece dimer polypeptides that bind to
their encoding plasmids with sufficient stability to
facilitate affinity purification, two headpiece domains were
inserted in a construct adjoined by random linkers (Figs. 5 &
6a). The vector (pDIMER1) contained two repeated segments of
the lac repressor gene, respectively encoding residues 1-49
and 2-49 of the headpiece DNA binding domain. These two
segments were linked by a sequence encoding a 4-5 random
residue "headpiece linker" which, based on molecular modeling,
might allow positioning of the headpiece DNA binding domains
for stable binding to lacOs sites present on the parent
plasmid. Fused to the second headpiece domain was a sequence
encoding a 4 random residue "display linker" designed to
facilitate the C-terminal display of peptide ligands. To
screen the initial "linker library", a 7 residue epitope
(RQFKVVT) for the D32.39 monoclonal antibody (Barrett &
Goldstein, Neuropeptides 5, 113-120 (1985)) was fused to the
C-terminal display linker. To increase the chance of finding
active headpiece dimers, the vector was designed to have two
lacOs sites.

Headpiece dimer "linker" library plasmid pDIMER1 was
constructed as follows. With 10 ng of pMC5 (encoding the lacI
headpiece domain) as a template, primers ON-929
(TATTTGCACGGCGTCACACTT [SEQ ID NO: 79]) and ON-930
(CCGCGCCTGGGCCCAGGGAATGTAATTGAGCTCCGCCATCGCCGCTT
[SEQ ID NO: 80]) were used in a (25 cycle) PCR
reaction to modify the ends of the region encoding the first
49 residues of lacI to form headpiece #1. About 1 µg of the
modified fragment encoding headpiece #1 was digested with
BamHI and ApaI, gel purified, and inserted between the BamHI
and ApaI sites of pMC5, replacing the lacI coding region, to
form intermediate plasmid pMC5dlacI. To construct "headpiece
#2", PCR primers ON-938
(CGATGGCGGAGCTCAATTACATTCCC-(NNK)5-AAACCAGTAACGTTATACGAT
[SEQ ID NO: 81]), ON-939
(CGATGGCGGAGCTCAATTACATTCCC-(NNK)4-AAACCAGTAACGTTATACGAT
[SEQ ID NO: 82]), and ON-940
(CGCCCGCCAAGCTTAGGTTACAACTTTGAACTGACG-(MNN)4-GGGAATGTAATTCAGCTCCGCCAT
[SEQ ID NO: 83]) were used to attach sequences
encoding a 4 or 5 random residue "headpiece" linker, a 4
random residue "display" linker, and the D32.39 monoclonal
antibody epitope (RQFKVVT [SEQ ID NO: 66]) (Barrett &
Goldstein, Neuropeptides 5, 113-120 (1985); Cull et al., Proc.
Natl. Acad. Sci. USA 89, 1865-1869 (1992)) to codons 2 through
49 of the lacI headpiece. Approximately 1 µg of the
end-modified DNA fragment encoding headpiece #2 was digested with
SstI and HindIII, gel purified and ligated into the SstI and
HindIII sites of pMC5dlacI. Plasmids encoding four-random-residue
headpiece linkers were combined with those encoding
five-random-residue linkers at a ratio of approximately 10:1
to make the pDIMER1 linker library. The pDIMER1 library was
introduced into bacterial strain ARI 230 by electroporation to
produce a library of 3 x 10^8 individual transformants.

The headpiece dimer gene was expressed under the
control of the araB promoter using three separate induction
levels. The linker library was amplified in three 325 ml
pools of LB/Amp100 medium (LB with 100 µg/ml ampicillin)
containing, respectively, LB with no additives for basal "A"
promoter induction, LB with 0.1% glucose followed by promoter
induction with 0.2% L-arabinose for 30 min prior to harvest to
give partial "B" induction levels, and LB with 0.2% L-arabinose
for 15 min prior to harvest for full "C" promoter induction.

Upon cell lysis, the subset of these plasmids that
displayed the D32.39 epitope was enriched relative to other
plasmids in the population. Stable complexes were captured by
panning the lysate in microtiter wells coated with immobilized
D32.39 antibody. Panning was carried out in Immulon 4
microtiter wells (Dynatech Laboratories) coated with 2 µg per
well D32.39 antibody as described in Example 5, except that
HEK/1% BSA (35 mM HEPES (Research Organics Inc.), pH 7.5 with
KOH, 0.1 mM EDTA, 50 mM KCl, 1% Bovine Serum Albumin, Fraction
V) replaced HEKL/BSA as the primary incubation and wash
buffer. In all rounds, 0.1 to 0.2 mg/ml sonicated herring DNA
was included in the incubation buffer as a nonspecific DNA
competitor. In rounds three and four of panning, 5 to
10 µg/ml of self-annealed ON-413, a lacOs containing
oligonucleotide (GAA TTC AAT TGT GAG CGC TCA CAA TTG AAT TC
[SEQ ID NO: 84]), was included in the incubation buffer as a
competitor. Following a one hour incubation at 4°C, unbound
headpiece dimer complexes were washed from the wells four
times with HEK/BSA followed by two washes with HEK. Bound
plasmids were extracted from the wells using 50 µl/well
TE/NaCl buffer (10 mM Tris-HCl (pH 7.5)/1 mM EDTA/0.5 M
NaCl) mixed with 50 µl/well phenol. After addition of 1 µl
glycogen carrier (20 mg/ml, Boehringer Mannheim), the
recovered plasmids were precipitated with an equal volume of
isopropanol, followed by a 10 minute spin at 14,000 rpm in a
microfuge. Plasmids were resuspended in 4 µl water and used
to transform bacterial strain ARI 230 for recovery counts and
further rounds of panning. After two rounds of panning,
enrichment numbers indicated that the pools grown under
conditions of "B" (partial) and "C" (full) promoter
induction gave the best enrichment. Based on this finding,
only the B and C pools were used in rounds 3 and 4.
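
The competitor oligonucleotide ON-413 is described above as self-annealed, which requires that the sequence equal its own reverse complement. This can be confirmed with a short check; the sequence is transcribed from the text, and the helper name is illustrative.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# ON-413 as given in the text, with spaces removed.
ON_413 = "GAA TTC AAT TGT GAG CGC TCA CAA TTG AAT TC".replace(" ", "")

is_self_complementary = reverse_complement(ON_413) == ON_413
print(len(ON_413), is_self_complementary)
```

The 32-nucleotide sequence is a perfect palindrome, so two copies can anneal to each other to form a duplex containing the symmetric operator core AATTGTGAGCGCTCACAATT.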

Sequencing of individual clones selected after four
rounds of panning revealed the primary structure of their
linkers. Of 22 clones that yielded readable sequence, 5
contained frameshifts or stop codons which would prevent
translation of the D32.39 epitope. Two B pool clones,
isolates B7 and B10, were present as duplicates, indicating
selective enrichment by the panning procedure from less than
one in 10^8 to more than one in six. Surprisingly, one of the
enriched clones, isolate B10, and one C pool clone, C5, had
frameshifts early in the second headpiece domain with a second
frameshift late in the headpiece coding sequence that restored
the reading frame of the D32.39 epitope (Fig. 6b).

To assess which clones encoded the most stable DNA
binding proteins while displaying the epitope in the most
favorable way, the clones were individually evaluated in a
panning experiment. Each clone having the D32.39 epitope in
the correct frame was examined together with one clone having
a frameshifted epitope as a negative control. An intact lacI
construct (see Example 3) served as a positive control. Each
clone was panned against D32.39 Ab and also against MAb344 as
a negative control. Specific enrichment was evaluated by
transformation of E. coli with recovered plasmids.

After four rounds of panning, individual clones were
grown in LB/Amp100/0.1% glucose for two hours at 37°C.
Following addition of L-arabinose to 0.2% (B induction),
cultures were grown for an additional 30 min, then chilled on
ice for harvest. 1 ml of each culture was microfuged for
2 min at 14,000 rpm. The cells were washed with 0.5 ml ice
cold WTEK buffer (50 mM Tris (pH 7.5), 10 mM EDTA, 100 mM
KCl), centrifuged for 2 min, washed with 0.25 ml cold TEK
buffer (10 mM Tris (pH 7.5)/0.1 mM EDTA/100 mM KCl),
centrifuged, then resuspended in 100 µl cold HEK buffer. To
each resuspended cell culture, 0.9 ml lysis buffer (35 mM
HEPES (pH 7.5 with KOH), 0.1 mM EDTA, 5% glycerol, 1 mM DTT,
0.1 mM PMSF (phenylmethylsulfonyl fluoride), 0.1 mg/ml BSA)
was added. Cells were lysed by adding 20 µl of 10 mg/ml
lysozyme (Boehringer Mannheim) to each tube followed by
incubation on ice for 1 hr. The lysed cell cultures were then
microfuged at 14,000 rpm for 10 min at 4°C, and the
supernatant transferred to a new tube.

To evaluate each clone, 10 µl of clear lysate was
added to methacrylate beads (Affi-prep 10 support, Bio Rad)
coated with the D32.39 monoclonal antibody, or negative
control MAb344, suspended in 0.5 ml HEK/BSA/0.01 mg/ml herring
DNA. After incubation at 4°C on a tube rotator for one hour,
beads were washed twice with cold HEK/BSA and twice with HEK
over a 50 min period. The remaining antibody-bound plasmid
complexes were recovered from the beads by phenol extraction
and isopropanol precipitation. Enrichment was defined as the
number of transforming units of plasmid recovered panning
against the D32.39 antibody beads divided by the number
recovered panning against the MAb344 control antibody.

The individual evaluations revealed relatively few
clones that yielded greater recovery with D32.39 Ab compared
to the negative control. Only four isolates (B7, B10, C4, C5
in Fig. 6) showed enrichment greater than two fold. The best
clones were B7 and B10, the same isolates that represented a
large fraction of the round four population. These isolates
yielded enrichment of 3 and 23 fold respectively. Of the four
clones showing specific enrichment, three contained cysteine
residues in their headpiece spanning linkers and all four had
a proline residue in their display linkers, suggesting that
some degree of activity might be conferred by these residues.
Surprisingly good enrichment was achieved by isolates B10 and
C5, which contain reading frame shifts in the region encoding
the second headpiece, resulting in an entirely different amino
acid sequence from that encoded in the first headpiece domain
(Fig. 6). Overall, the headpiece dimer clones performed less
well than the lacI system described in Example 3.

2. Isolation of Mutant Headpiece Dimers
To increase headpiece dimer/DNA complex stability
and thereby increase the panning performance of individual
clones, random mutations were introduced in the regions
encoding the headpiece dimer and adjoining linkers. Some
mutations in lacI resulting in tighter-binding mutants have
been reported (Betz & Sadler, J. Mol. Biol. 105, 293-319
(1976); Kleina & Miller, J. Mol. Biol. 212, 295-313 (1990);
Kolkhof, Nucl. Acids Res. 20, 5035-5039 (1992); Maurizot &
Grebert, FEBS Lett. 239(1), 105-108 (1988); Miller, The Operon
(Miller & Reznikoff, eds.), pp. [...] (1980), Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY).

The starting population for mutagenesis was the
headpiece dimer B pool, obtained after four rounds of random
linker library panning. [...] Plasmids, isolated after four
rounds of affinity purification, were used as a template.
Flanking primers were used in an adaptation of mutagenic PCR
(Gram et al., Proc. Natl. Acad. Sci. USA 89, 3576-3580 (1992);
Leung et al., Technique 1, 11-15 (1989)) to generate mutations
within the headpiece dimer and linker coding sequence.
Approximately 2 µg of mutated DNA fragments generated by PCR
was digested using NheI and HindIII and cloned into plasmid
pJ[...] (a single lacOs-containing vector). The ligation was
amplified in ARI 280 using A and B induction conditions, to
lower the total amount of [...] protein in the cells, as
described above, to produce a [...] of 1.6 x 103 individual
transformants. Stable headpiece dimer/plasmid complexes were
selected from this population by [...] (Affi-prep 10 support,
Bio Rad, coated with the D32.39 [...]). Elution from beads was
carried out for one hour at 4°C [...] peptide containing the
RQFKVVT [SEQ ID NO: 55] epitope in all four rounds.

To determine whether any specific mutations had been
selected through four rounds of panning and amplification, the
regions encoding the headpiece DNA binding domain and
adjoining linkers were sequenced from individual clones. Six
of eight A pool clones and one B pool clone had a specific Q
to R mutation at position 18 within one or both headpiece
domains (Fig. 6). Significantly, this Q18 to R mutation falls
in a portion of the headpiece DNA binding helix that is
critical for the recognition of operator DNA by lacI (Boelens
et al., J. Mol. Biol. 193, 213-216 (1987); Chuprina et al.,
1993; Ebright, Proc. Natl. Acad. Sci. USA 83, 303-307 (1986);
Lamerichs et al., Biochemistry 28, 2985-2991 (1989); Lehming
et al., EMBO J. 9, 615-621 (1990); Lehming et al., EMBO J. 6,
3145-3153 (1987); Lehming et al., Proc. Natl. Acad. Sci. USA
85, 7947-7951 (1988); Sartorius et al., EMBO J. 8, 1265-1270
(1989)). Many other mutations were present throughout the
headpiece dimer and linker coding regions.

To evaluate individual mutant headpiece dimer
clones, single clones were analyzed for enrichment under the
conditions described above. Non-mutant headpiece dimer B7, a
lacI clone, and an out-of-frame headpiece dimer clone were
used as controls. The headpiece linkers chosen to display the
mutant headpiece clone had each been identified in more than
one clonal isolate in the linker optimization screening. The
promoter induction conditions for these tests were A and B,
reflecting the conditions used for the initial selection of
the mutants.

                           TABLE 2

                                         Enrichment
Isolate    Description          Exp. #1    Exp. #2    Exp. #3
ARI 192    wt. lacI (+CT)                  155        32.2
           Frameshift (N.C.)    0.44       0.77       0.44
B7         wt. HpD.             0.64       1.24       10.0
A4.2       mutant HpD.          6.2        3.3        61.0
A4.5       mutant HpD.          360        377        1017
A4.7       mutant HpD.          11.6
A4.3       mutant HpD.          95.0       11.6       70.0
B4.5       mutant HpD.          365        116        350
B4.7       mutant HpD.          333        3.5        13.0
B4.3       mutant HpD.          -          14.6       57.0


Experiments #1 and #2 were carried out using basal (A)
promoter induction conditions; experiment #3 was carried out
using partial (B) promoter induction.

Table 2 shows that greater enrichment was obtained from
all of the selected mutants than from the wild-type headpiece
dimer B7. Of the nine mutants tested, two isolates from separate
pools, numbered A4.5 and B4.5, showed the greatest enrichment
under both basal and partial promoter induction conditions.
These two clones share the same Q18 to R mutation in their
second headpiece DNA recognition helices, suggesting that this
mutation might be important for DNA binding. These clones
also contain the GRCR headpiece linker found in the B7
isolate, although mutant A4.5 contains a display linker that
is different than the one shared by B7 and mutant B4.5
(Fig. 6).

Expression levels of several mutant headpiece dimer
proteins were analyzed in whole cell lysates on SDS/PAGE to
determine whether increased enrichment was due to increased
expression levels. Staining of proteins from cells grown
under conditions of A or B induction showed little difference
in 14.5 kD headpiece dimer polypeptide expression between
mutants A4.5 and B4.5 as compared to the non-mutant B7.
Western blot analysis of these clones using the D32.39
antibody showed similar levels of expression, indicating that
levels of enrichment for individual mutants were probably due
to structure and not expression levels.

3. Screening a Random Library Using Optimized
Headpieces and Linkers

The library was constructed in plasmid pCMG14, which
contains headpiece dimer mutant A4.5 under the control of the
araB promoter. A series of restriction sites at the 3' end of
the gene facilitate cloning of synthetic oligonucleotides,
allowing fusion of the headpiece dimer display linker to a
random peptide. Each member of the random library consists of
a peptide-displaying headpiece dimer bound to its encoding
plasmid.

A random dodecamer library comprising 10^9
oligonucleotide members was inserted into pCMG14. As a
control, a lacI-based peptides-on-plasmids library of similar
size using the same random library oligonucleotides was
constructed in parallel. Identical bacterial strains, panning
conditions, and basal promoter induction were used for both
libraries. Libraries were panned in microtiter wells coated
with D32.39 antibody or the same amount of MAb344, as a
negative control. Recovery of plasmids during panning yielded
enrichment for both libraries. By round 4, the headpiece
dimer library showed 1355 fold enrichment over the negative
control while the lacI library yielded 1150 fold enrichment.
Fig. 3 shows that isolates picked from both libraries encoded
peptide structures similar to the D32.39 antibody epitope
(RQFKVVT [SEQ ID NO:56]). The enrichment and sequencing
results show that the headpiece dimer system selects peptide
sequences that bind specifically to a receptor.

To verify receptor specificity and to determine the
relative affinity of the headpiece dimer versus the lacI
library derived peptide sequences obtained through panning,
the peptide-encoding sequences from each fourth round library
pool were transferred into a vector such that the peptides
would be fused in frame with the maltose binding protein (MBP)
(Bedouelle & Duplay, Eur. J. Biochem. 171, 541-549 (1988);
Duplay et al., J. Biol. Chem. 259, 10606-10613 (1984); Guan
et al., Gene 67, 21-30 (1988); Maina et al., Gene 74, 365-373
(1988)). This transfer permitted comparison of the headpiece
dimer and lacI derived peptides fused in an identical fashion
to the same carrier protein.
Under the conditions employed, MBP exists primarily
as a monomer (Blondel & Bedouelle, Protein Engineering 4,
457-461 (1991); Kellermann & Ferenci, Meth. Enzymol. 90,
459-463 (1982); Richarme, Biochem. Biophys. Res. Comm. 105,
476-481 (1982); Richarme, Biochim. Biophys. Acta 748, 99-108
(1983)) and thus the MBP-peptide fusions would be expected to
bind to receptor monovalently. This non-cooperative interaction
should allow a good correlation between affinity of the
peptide for the receptor and the level of receptor occupancy
during binding and washing steps. The intensity of the ELISA
signal is expected to correlate approximately with peptide
affinity. Evidence supporting this hypothesis was obtained by
comparing the ELISA signal strength produced by MBP fused to
different epitopes of known affinity. Fig. * demonstrates
that MBP fused to epitopes with affinities of 340 nM (pCMG33)
and 0.51 nM (pCMG39) produced dramatically different ELISA
signals. Using other peptide ligand families, the intensity
of the signal in the MBP ELISA correlates approximately with
the affinity of the ligand for a receptor.

Lysates of 23 randomly picked isolates from each
library pool were tested in ELISAs with the D32.39 antibody
test receptor, or MAb344 and BSA as controls. Clones without
the correct insert DNA structure determined by sequencing were
excluded from subsequent analysis. Fig. 9 shows that of 19
clones from the headpiece dimer library pool included in the
analysis, 13 showed MBP ELISA signals greater than 0.5. Of 21
isolates from the lacI library, however, only 2 yielded
signals of 0.5 or greater. A comparison of the two data sets
using an unpaired t test showed the difference was significant
[...] indicates that the headpiece [...]
enrich ligands with higher [...]
by the multivalent lacI-based [...]
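
The comparison of the two ELISA data sets can be illustrated with a short sketch. The text reports only the clone counts (13 of 19 and 2 of 21) and that an unpaired t test was used; the individual absorbance readings below are hypothetical, and the choice of the Welch form of the unpaired t statistic is an assumption of this sketch rather than something stated in the text.

```python
import math
import statistics


def fraction_above(signals, threshold=0.5):
    """Fraction of ELISA signals strictly greater than the threshold."""
    return sum(1 for s in signals if s > threshold) / len(signals)


def welch_t(a, b):
    """Unpaired (Welch) t statistic for two independent samples."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


# Proportions implied by the counts reported in the text.
print(round(13 / 19, 2), round(2 / 21, 2))

# Hypothetical absorbance readings, for illustration only.
headpiece_dimer = [0.9, 1.2, 0.3, 0.8, 1.1]
laci_library = [0.1, 0.2, 0.6, 0.15, 0.1]
print(fraction_above(headpiece_dimer), fraction_above(laci_library))
print(welch_t(headpiece_dimer, laci_library))
```

A larger absolute t value corresponds to a clearer separation between the two sets of signals; converting it to a significance level would additionally require the t distribution, which is omitted here.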



4. Headpiece Dimer DNA Binding Studies
The selection system by which headpiece dimers were
selected demanded two things. First, that the protein bind to
the plasmid that encoded it with acceptably high stability.
Second, that the plasmid-protein complex display a peptide in
such a way that it was available for binding to an immobilized
receptor. Many mutations were present in the pool of selected
headpiece dimers including the Q18 to R mutation, at a
position known to be critical for lacOs sequence recognition
in lacI (Lehming et al., EMBO J. 9, 615-621 (1990)). This
experiment investigates whether some of the headpiece dimer
mutants might employ DNA binding sites other than lacOs.

To compare the mechanism of DNA binding between the
mutant headpiece dimer A4.5 and lacI, two pairs of plasmids
were constructed, one pair with, and the other pair without,
lacOs binding sites. Removal of lacOs sites from the plasmids
was carried out by replacing the NheI to AlwNI fragment with a
similar fragment that lacked lacOs. One member of each pair
displayed the D32.39 epitope linked to chloramphenicol
resistance, and the other carried ampicillin resistance but
lacked the epitope. Starting with cells mixed in the ratio of
~1 Cam^r cell to 1000 Amp^r cells, lysates were panned against
the D32.39 antibody and the control MAb344. Plasmids
recovered from the antibody coated wells were transformed into
E. coli and plated on Amp (100 µg/ml) and Cam (20 µg/ml)
plates for the determination of Amp/Cam plasmid ratios as a
measure of enrichment. Enrichment was defined as the starting
Amp/Cam plasmid ratio divided by the final (panning derived)
Amp/Cam ratio.
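
The ratio-of-ratios definition above can be sketched as follows. The function name and the colony counts are hypothetical; only the 1000:1 starting mixture comes from the text.

```python
def amp_cam_enrichment(start_amp, start_cam, final_amp, final_cam):
    """Enrichment of the Cam-resistant (epitope-displaying) plasmid:
    starting Amp/Cam ratio divided by the Amp/Cam ratio after panning."""
    start_ratio = start_amp / start_cam
    final_ratio = final_amp / final_cam
    return start_ratio / final_ratio


# Starting mixture of about 1000 Amp-resistant cells per Cam-resistant cell.
# The post-panning colony counts are hypothetical, for illustration only.
print(amp_cam_enrichment(1000, 1, final_amp=50, final_cam=25))
```

Because the Cam-resistant plasmid carries the epitope, a fall in the Amp/Cam ratio after panning translates into an enrichment greater than 1.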

As expected, in three separate experiments, deletion
of lacOs sites resulted in an average 445 fold enrichment drop
to near background level for the lacI peptides-on-plasmids
construct. For headpiece dimer mutant A4.5 constructs,
however, deletion of the plasmid lacOs site had no significant
effect on enrichment. An average 43 fold enrichment was
obtained by the headpiece dimer mutant A4.5 constructs with
lacOs and an average 44 fold was achieved without lacOs. This
finding suggests that the headpiece dimer mutant A4.5 does not
require binding to lacOs as a mechanism of linkage to its
parent plasmid. This is consistent with observations made on
mutants of full length lacI which, upon substitution at
position 18, lose lacO site binding specificity (Kleina &
Miller, J. Mol. Biol. 212, 295-313 (1990); Lehming et al.,
EMBO J. 9, 615-621 (1990); Lehming et al., EMBO J. 6,
3145-3153 (1987)).

To determine the plasmid binding site(s) of
headpiece dimer mutant A4.5 and the non-mutant B7, several
protein-DNA binding experiments were performed. Preliminary
gel shift experiments with plasmid pCMG14 digested into small
fragments, combined with headpiece dimer A4.5 polypeptide,
resulted in no visible shift for any of the plasmid fragments.
Other experiments using 32P-labeled plasmid fragments
complexed with over-expressed headpiece dimer A4.5, B7, and
full-length lacI polypeptides showed specific binding of lacI
to lacOs-containing fragments, but failed to show any specific
binding by the mutant (A4.5) and non-mutant (B7) headpiece
dimers. Another experiment, in the absence of unlabelled
herring DNA competitor, showed nonspecific binding to all of
the plasmid fragments by lacI and both headpiece dimer
isolates, indicating that some degree of nonspecific DNA
binding occurs for headpiece dimers and lacI alike.

These results indicate that the mutant headpiece
dimer, while offering improved performance over the lacI
plasmid, surprisingly does not require a lacO binding site for
use in the panning procedure. The in vitro binding data
suggest that the mutant headpiece dimer may not show a strong
preference for a specific sequence in the vector encoding the
headpiece dimer.

Although the mechanism and degree of sequence
specificity, if any, by which the mutant headpiece binds to
DNA are unclear, the power of the above methodology to
self-select optimal DNA binding proteins for use in screening
peptides is evident. The method has self-selected a
derivative DNA binding protein that has substantially
different binding characteristics than the natural lacI
protein, but which offers improved enrichment compared with
the lacI parent protein in selection of peptides having high
affinity for a receptor.



Example 5 

Standard Protocol

This Example provides a standard protocol for the
method of the present invention with any receptor that can be
immobilized on a microtiter dish with an immobilizing
antibody.



(1) Reagents
To practice the method, the following reagents will
be helpful.

BSA, fraction V, RIA grade
BSA, protease free
Bulk DNA, sonicated, phenol extracted
Centriprep 100 concentrator, 5-15 ml
Chromatography column, C-22X250
Coomassie Plus protein assay reagent
DTT
EDTA, disodium, dihydrate
Ethyl alcohol, 200 proof
Glycerol
Glycogen, molecular biology grade
HEPES free acid
Isopropanol, HPLC grade
IPTG
α-Lactose, monohydrate
Lysozyme, from hen egg white
Microtiter plate, Immulon 4, flat bottom
PBS
PMSF
Phenol, equilibrated
Phenol:chloroform:isoamyl alcohol
Potassium hydroxide solution, 3.0 N
Potassium chloride
Sodium chloride
Sephacryl S-400, high resolution
Tubes w/ screw cap, 13 ml

The various buffers and other preparations referred
to in the protocol are shown below.

HE buffer is prepared at pH = 7.5 by adding 8.34 g of
HEPES, free acid (use a better grade than Sigma's standard;
the final concentration is 35 mM), to 200 µl of 0.5 M EDTA,
pH 8.0 (final concentration is 0.1 mM) and adding water to a
final volume of 1 L. The pH is adjusted with KOH.

HEK buffer is identical to HE buffer but also
contains KCl at a final concentration of 50 mM.

HEKL buffer is identical to HEK buffer but also
contains alpha-lactose, which may require warming to go into
solution, at a final concentration of 0.2 M.

Lysis buffer (6 ml) is prepared by mixing 4.2 ml of
HE buffer with 1 ml of 50% glycerol, 750 µl of protease free
BSA at 10 mg/ml in PBS, 10 µl of 0.5 M DTT, and 12.5 µl of
0.1 M PMSF in isopropanol.

HEK/BSA buffer is prepared by dissolving 5 g of
BSA, fraction V, in 500 ml of HEK buffer.

WTEK buffer is prepared at pH = 7.5 by adding 7.53 g
of Tris, pH = 7.5 (final concentration of 50 mM), to 20 ml of
0.5 M EDTA (final concentration of 10 mM) and 7.45 g of KCl
(final concentration of 100 mM) and adding water to a final
volume of 1 L.
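
As an arithmetic cross-check of the buffer recipes in this section, the stated final concentrations can be related to the weighed masses and pipetted volumes. The sketch below is illustrative only: the molecular weights are standard values assumed here (they are not given in the text), and the helper names are hypothetical.

```python
# Molecular weights are standard values assumed for this sketch;
# they are not stated in the text.
HEPES_MW = 238.3   # g/mol, HEPES free acid
KCL_MW = 74.55     # g/mol, potassium chloride


def grams_needed(molar, mol_weight, litres):
    """Mass of solid needed to reach a target molar concentration."""
    return molar * mol_weight * litres


def diluted_conc(stock_conc, stock_volume, final_volume):
    """Concentration after dilution (C1 * V1 = C2 * V2)."""
    return stock_conc * stock_volume / final_volume


hepes_g = grams_needed(0.035, HEPES_MW, 1.0)   # 35 mM HEPES in 1 L
kcl_g = grams_needed(0.100, KCL_MW, 1.0)       # 100 mM KCl in 1 L
edta_mM = diluted_conc(500.0, 0.2, 1000.0)     # 0.2 ml of 500 mM EDTA in 1000 ml
lysis_ml = 4.2 + 1.0 + 0.750 + 0.010 + 0.0125  # lysis buffer components, in ml

print(round(hepes_g, 2), round(kcl_g, 3), edta_mM, round(lysis_ml, 2))
```

Under these assumptions, 35 mM HEPES corresponds to roughly 8.3 g per litre, 100 mM KCl to roughly 7.5 g per litre, and the lysis buffer components sum to just under 6 ml.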

TEK buffer is prepared at pH = 7.5 by adding 1.51 g
of Tris, pH = 7.5 (final concentration of 10 mM), to 200 µl of
0.5 M EDTA (final concentration of 0.1 mM) and 7.45 g of KCl
(final concentration of 100 mM) and adding water to a final
volume of 1 L.

[...] coli [...] sulA1 hsdR17 Δ(ompT-fepC) [...]
for random dodecamer panning. E. coli B [...]
endA1 nupG lon-11 sulA1 hsdR17 Δ(ompT-fepC) ΔclpA319 [...]
and to isolate DNA for sequencing.

The various mutations in strain ARI 814 are designed to
enhance various aspects of panning as described below. It was
constructed in 11 steps starting with an E. coli B strain from
the E. coli Genetic Stock Center at Yale University (E. coli
B/r, stock center designation CGSC6573) with genotype lon-11
sulA1. This strain was chosen as a starting point because of
its robust growth properties and because it yields excellent
electrocompetent cells, which are essential for construction
of large libraries and for the maintenance of clone diversity
during panning. In spite of considerable genetic [...]
through the construction process.

The strain contains the hsdR17 allele from strain
MC1061, which prevents restriction of unmodified DNA introduced
by transformation or transduction. This mutation helps
maintain library diversity and simplified further construction.

The ompT-fepC deletion from strain UT5600 removes the
gene encoding the OmpT protease, which digests peptides
between paired basic residues. This protease is extremely
active in cell lysates and would potentially have been a major
limitation on the diversity of peptides in a random library.
The lon-11 and clpA mutations also limit proteolysis because
they prevent expression of ATP-dependent, cytoplasmic
proteases. The sulA1 allele suppresses a deleterious
filamentation phenotype often caused by lon mutations.

ARI 814 also contains a deletion of the lacI gene to
prevent expression of wild-type lac repressor, which would
compete with the fusion constructs for binding to the lacO
sites on the plasmid. The lacZ mutation prevents waste of the
cell's metabolic resources making β-galactosidase due to
absence of the repressor. The endA1 mutation knocks out
expression of a nuclease that has two deleterious effects on
panning. First, it could digest plasmids in the crude cell
lysate used for panning, reducing the number of recoverable
complexes. Second, it lowers the quality of DNA preparations
used for cloning or sequencing. Finally, the ARI 814 strain
contains a recA deletion to prevent multimerization of
plasmids through recA-catalyzed homologous recombination.

ARI 814 is prepared for use in electroporation
essentially as described by Dower, supra, except that 10%
glycerol is used for all wash steps. The cells are tested for
efficiency using 1 pg of a pBluescript plasmid (Stratagene).
Cells routinely yield transformation frequencies of 2 x 10^10
colonies per µg of DNA. These cells are used for growth of
the original library and for amplification of the enriched
population after each round of panning.

(3) Library construction 

The interrupted palindrome SfiI sites in pJS142 allow
efficient cloning of library oligos because they greatly
minimize undesired ligation events. Only the correct
orientation of the annealed library oligos can ligate
efficiently into the vector. In addition, once the SfiI
digested vector is purified away from the small internal
"stuffer" fragment, the vector ends cannot ligate to each
other because of incompatible sticky ends. Libraries
routinely have greater than 10^8 independent clones per µg of
vector used in the ligation.

Vector fragment for library construction can be
purified from the stuffer fragment by either of two methods.
For small scale (5-10 µg) library construction, pJS142 is
digested with SfiI and then with EagI (to reduce background)
and electrophoresed on an agarose gel. The vector fragment
can be eluted from the gel using the Geneclean kit (Bio 101).
For larger scale preparations, potassium acetate gradients are
used to purify vector fragment.

a. Procedure for Purification of Vector for Library
Construction

1. Digest 200 µg of pJS142 DNA to completion in
1 ml final volume with SfiI followed by EagI.

2. In a 1/2" x 2" ultraclear centrifuge tube,
carefully layer 5%, 10%, 15%, and 20% potassium acetate
solutions containing 1 mM EDTA and 2 µg/ml ethidium bromide,
using 1 ml of each.

3. Layer 1 ml of the digest on top of the gradient.
Centrifuge at 43,000 rpm for 3 hrs in a Beckman SW50.1 rotor.
The large vector fragment will migrate to a position ~2/3 of
the distance from the top of the gradient as visualized with a
long wave UV source. The small stuffer fragment remains at
the top of the gradient while undigested supercoiled DNA forms
a pellet on the bottom of the tube.

4. Puncture the tube with an 18 g syringe needle
attached to a 3 ml syringe and extract the fragment (~0.5 to
1.0 ml).

5. Remove the ethidium by extracting five times
with an equal volume of water saturated 1-butanol.

6. Transfer to a microfuge tube, add 1/10 volume
3 M NaCl, and then an equal volume of isopropanol. Centrifuge
at top speed for 10 min, pour off the liquid, and wash once
with 80% ethanol.

7. Resuspend the pellet in water or TE and
determine the concentration by reading A260. The yield from
the gradient is usually about 40% of the input amount.

b. Procedure for Library Construction
Three oligos are needed for library construction,
ON-829 (5' ACC ACC TCC GG), ON-830 (5' TTA CTT AGT TA), and a
library specific oligo of sequence (5' GA GGT GGT {NNK}n TAA
CTA AGT AAA GC [SEQ ID NOS:35 and 36]), where {NNK}n denotes a
random region of the desired length and sequence. The oligos
can be 5'-phosphorylated chemically during synthesis or after
purification with polynucleotide kinase. They are then
annealed at a 1:1:1 molar ratio and ligated to the vector.
Note that the melting temperature of the annealed oligo
complex is quite low, so the final annealed mixture should
never be warmed above the 14°C ligation temperature.
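
The oligo design described above can be checked with a minimal sketch. It builds one hypothetical library oligo with a random NNK region (N = A, C, G or T; K = G or T, following the standard IUPAC codes) and confirms that ON-829 and ON-830 are complementary to the fixed flanking sequences. The helper names and the random seed are assumptions of this sketch.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

ON_829 = "ACCACCTCCGG"            # 5' -> 3', as given in the text
ON_830 = "TTACTTAGTTA"            # 5' -> 3', as given in the text
LEFT_FLANK = "GAGGTGGT"           # fixed 5' end of the library oligo
RIGHT_FLANK = "TAACTAAGTAAAGC"    # fixed 3' end of the library oligo


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' -> 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def random_nnk(n_codons, rng):
    """Random region of n_codons NNK codons (third base is G or T)."""
    return "".join(
        rng.choice("ACGT") + rng.choice("ACGT") + rng.choice("GT")
        for _ in range(n_codons)
    )


def library_oligo(n_codons, rng):
    """One member of the library: fixed flanks around a random NNK region."""
    return LEFT_FLANK + random_nnk(n_codons, rng) + RIGHT_FLANK


oligo = library_oligo(12, random.Random(0))   # a dodecamer-encoding member
print(oligo)
print(reverse_complement(ON_829).endswith(LEFT_FLANK))
print(RIGHT_FLANK.startswith(reverse_complement(ON_830)))
```

In this sketch ON-829 pairs with the eight fixed bases at the 5' end of the library oligo and ON-830 pairs with the first eleven bases of the 3' flank, so each annealed end is left with a short single-stranded overhang. The NNK scheme allows 32 codons per position, covering all 20 amino acids with a single stop codon (TAG).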
1. Mix phosphorylated ON-829, ON-830, and the
library oligo (50 pmol each), 1 µl 5 M NaCl, 2.5 µl 1 M Tris,
pH 7.4, and dH2O to bring the total volume to 50 µl.

2. Heat to 70°C for 5 min in a temp block and then
turn off the block and allow the mixture to cool slowly to
around 30°C. Move the whole temp block into a 4°C room or
refrigerator and allow it to cool to below 10°C, then move the
samples onto ice.

3. Mix on ice: 5 µg (1.3 picomole) pJS142 fragment,
2.6 µl (2.6 picomole) annealing mix, 25 µl 10x ligase buffer,
dH2O to 250 µl, mix, then add 2 µl (800 NEB cohesive end
units) T4 ligase. In parallel, set up a 1/10 scale no oligo
control to check for background. Incubate at 14°C for
12-24 hours.

4. Heat to 65°C, 10 min to inactivate the ligase.
Add 2 µl 25 mM dNTP mixture (Pharmacia), 1 µl (13 units)
Sequenase 2.0 (US Biochemical). Add 1/10 amounts to the
control ligation [...] determine ligation efficiency compared
to the control.

5. Add 250 µl H2O, 55 µl 5 M NaCl to the library,
extract with 300 µl phenol/CHCl3, spin 3 min, and move 500 µl
of the aqueous phase to a new microfuge tube.

6. Add 1 µl 20 mg/ml glycogen (Boehringer Mannheim
molecular biology grade) and 500 µl isopropanol. Mix well and
spin in microfuge at top speed for 10 min.

7. Pour off the liquid, close the tube, and spin
briefly. Use a fine bore pipet tip to remove the last traces
of liquid without disturbing the pellet. Wash the pellet with
500 µl of 4°C 80% ethanol, spin 2 min. Pour off the liquid,
close the tube, and spin briefly. Use a fine bore pipet tip
to remove the last traces of liquid. This careful washing
procedure is important to remove all traces of salt to prevent
problems during the electroporation step.

8. Resuspend the pellet in [...] µl dH2O. Store at [...]
until ready for amplification.

(4) Screening

The library can be screened over a two-day period as
follows.

Day 1

1. Coat two sets of 12 microtiter wells with the
appropriate amount of immobilizing antibody in 100 µl of PBS
for panning and negative control; let the coated plate
incubate at 37°C for 1 hr. Consider using all 24 wells as
"plus receptor" wells in the first round, i.e., no negative
control in the first round.

2. Wash the plate four times (4x) with HEK/BSA.

3. Block wells by adding 200 µl of HEK/BSA to each
well; let the plate incubate at 37°C for 1 hr.

4. Wash the plate 4x with HEK/BSA.

5. Dilute the receptor preparation in cold HEK/BSA
(or appropriate binding buffer) as necessary.

6. Add the diluted receptor preparation to the
wells at 100 µl per well; let the plate incubate at 4°C for
1 hr. with agitation.

7. Wash the plate 2x with cold HEK/BSA.

8. Add 100 µl of 0.1 mg/ml bulk DNA in HEK/BSA to
each well; incubate the plate at 4°C for at least 10 minutes.

On day 1, steps A-O can also be carried out. Note
that the column separation (steps A and H-O) is optional. If
the column separation is omitted, the lysates from step G are
added directly to the wells.

A. Begin equilibrating column (22 mm diameter x
22 cm height of Sephacryl S-400) with cold HEKL (~1 hr, flow
rate is set to collect 5 ml fractions every 2 to 3 minutes).

B. Prepare 1 ml of lysozyme at 10 mg/ml in cold HE.

C. Thaw and combine sub-libraries (2 ml total
volume) in a 13 ml Sarstedt screw cap tube.

D. Add 6 ml of lysis buffer and 150 µl lysozyme
solution (Boehringer lysozyme is preferred over Sigma
lysozyme); mix by inverting gently; and incubate on ice for
5 minutes, although less time is often satisfactory.

E. Add 2 ml of 20% lactose (lacI libraries only)
and 250 µl of 2 M KCl (200 µl for headpiece dimer libraries)
and mix by inverting gently.

F. Spin at 14.5 K for 15 minutes.

G. Transfer supernatant by pipetting into a new
tube.

H. Load raw lysate onto the equilibrated column.

I. After lysate is loaded, collect ten 5 ml
fractions.

J. Perform the coomassie protein assay as follows:
(1) to 10 microtiter wells, add 100 µl of coomassie reagent
and 20 µl from each fraction, and mix; (2) select 4
consecutive fractions which correspond to 1 brown and 3 blue
wells from the assay (light blue counts as blue).

K. Combine selected fractions in a Centriprep 100.
Two centripreps may be used to speed up the process. The
maximum capacity of each centriprep is about 15 ml.

L. Spin in J-6B centrifuge at 1500 rpm.

M. Rinse the column with cold HEK for 1 hr.

N. Empty liquid from the inner chamber every
15 minutes until final volume < 2 ml (~1 hr).

O. Determine lysate volume, and remove 1% as "Pre"
sample; keep Pre sample on ice.
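
The fraction-selection rule in step J can be sketched as a small search over the ten assay wells. The colour encoding, the function name, and the assumption that the brown well immediately precedes the three blue wells are all choices made for this illustration; the text specifies only that the four fractions are consecutive and comprise one brown and three blue wells.

```python
def select_fractions(colors):
    """Return the 1-based numbers of four consecutive fractions made up of
    one brown well followed by three blue wells, or None if absent.
    Light blue wells should be recorded as "blue" before calling."""
    for i in range(len(colors) - 3):
        window = colors[i:i + 4]
        if window[0] == "brown" and all(c == "blue" for c in window[1:]):
            return [i + 1, i + 2, i + 3, i + 4]
    return None


# Hypothetical assay result for ten fractions, for illustration only.
wells = ["brown"] * 3 + ["blue"] * 5 + ["brown"] * 2
print(select_fractions(wells))
```

The selected fractions would then be pooled and concentrated as in steps K through N.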

Returning to the numbered steps, one proceeds as
follows.

9. Wash plate 2x with cold HEK/BSA.

10. Bring the volume of the concentrated lysate up
to 2400 µl by adding HEKL/BSA; add bulk DNA to a final
concentration of 0.1 mg/ml. The activity of the receptor in
this buffer should be verified.

11. Add lysate at 100 µl per well; incubate the
plate at 4°C for 1 hr. with agitation.

12. Wash plate 4x with cold HEKL/BSA.

13. Add 100 µl of 0.1 mg/ml bulk DNA in HEKL/BSA to
each well; incubate at 4°C for 30 minutes with agitation.

14. Wash plate 4x with cold HEKL.

15. Quickly wash plate 1x with cold HEK.

16. Elute by adding to each well 50 µl 10 mM Tris
pH 8, 1 mM EDTA, 0.5 M NaCl, then add 50 µl equilibrated
phenol, and agitate for 5 min.

17. Remove all eluants; centrifuge to separate
phases; remove aqueous phase to a new tube.

18. Add one-tenth volume of 5 M NaCl and 1 µl of
20 mg/ml glycogen as carrier.

19. Precipitate plasmids in equal volume of
isopropanol at room temperature.

20. Spin 10 minutes; carefully remove supernatant,
spin again, and remove remaining supernatant.

21. Wash with 200 µl of cold 70% EtOH.

22. Spin and remove traces of supernatant as above.

23. Resuspend plasmids in water (suggested volumes:
100 µl for Pre; and 4 µl each for the panning and negative
control wells; use more than 4 µl for panning and negative
control samples in later rounds to retain as backups).

20 24. Chili 4 sterile 0.2 cm electrode gap cuvettes on 

ice. The panning sample is divided equally into 2 cuvettes to 
prevent complete loss of sample during electr opora t ion . 

25. To three 15 ml sterile culture tubes, add l mi 
SOC r.ediur. ( 2 % 3acto-Tryptone , 0.5% 3act:-yeas: extract, 10 mM 

25 N*a2i, 2.5 m>! KCi, 10 n.M MgCl 2 , 10 m>I MgS0 4 , and 20 nM Glucose) 
to two tubes and 2 ml tc one tube. Label the two 1 ml tubes 
as "Pre" and "NC" (for "negative control"), and label the 2 ml 
tube as "Pan" (for "panning") . 

25. Thaw 200 pi cf high efficiency electro-competent 

33 ceils. 

27. Transfer A 0 ul aliquots of cells to 4 chilled 
sterile eppendorf tubes; incubate the tubes on ice. 

23. Add 2 pi of each piasmid to each tube and mix 

gently . 

35 29. Transfer ceils/plasmids mixtures into their 

corresponding cuvettes; keep the cuvettes on ice. 

30. Set the Gene Pulser apparatus to 2.5 kV, 25 p? 
capacity, and set the Pulser Controller unit to 200 chms . 



10 



W0 96/40987 PCT/USW09809 

75 

31. Apply one pulse (time constant = 4-5 msec) . 

32. Immediately add the room temperature SOC medium 
to resuspend cells in the cuvette, 

33. Transfer cell suspension back to the culture 

tube. 

34. Incubate the culture tube at 37 °c for 1 hr. vith 
agitation . 

35. To 200 ml of L3 broth prewarmed to 37°C, add 
0.4 ml of 50 mg/ml ampicillin. 

36. Remove 10 to 100 fil of the "Pan" library culture 
for plating, and transfer the rest (2 ml) to the prewarmed L3 
broth. Plate out several dilutions of each sample on L3 
plates containing ampicillin. Suggested plate dilutions are 
as fallows: Pre — 10~ 5 , 10" 6 and 10" 7 ; and ?an/NC — 10~ 3 , 

15 10 -4 , 10~ 5 and 10~ 5 . 

37. Grow ,f Pan" library at 37 °C for about 4-5 hr. 
until the OD 6G0 = 0.5-l.C. 

33. Chill the flask rapidly in ice vater for at 
least: 10 -minutes. 

20 39. Centrifuge cells in 250 ml sterile bottle at 6X 

for 5 minutes, Backman JA-I4 rotor. 

40. Wash by vortexing cells in 100 ml of cold WTEX. 

41. Centrifuge at 6X for 6 minutes. 

42. Wash by vcrtaxing cells in 50ml cold TEX . 
25 43. Centrifuge at 6X for 6 minutes. 

44. Resuspend cells in 4 ml of HEX and store in two 
2 ml vials at -70°C. Use one tube for the next round; keep 
the other as a backup. 
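
The plate counts from step 36 can be converted into an estimate of the number of transformants in each recovery culture. The following is a minimal sketch and not part of the protocol: the function name, the 0.1 ml plating volume and the colony count in the usage lines are illustrative assumptions.

```python
def transformants(colonies, dilution, plated_ml, culture_ml):
    """Estimate total transformants in a recovery culture from one plate.

    colonies   -- colonies counted on the plate
    dilution   -- dilution factor of the plated sample (e.g. 1e-4)
    plated_ml  -- volume spread on the plate in ml (assumed; not given in the protocol)
    culture_ml -- volume of the recovery culture in ml
    """
    if colonies < 0 or dilution <= 0 or plated_ml <= 0 or culture_ml <= 0:
        raise ValueError("colonies must be >= 0; dilution and volumes must be > 0")
    cfu_per_ml = colonies / (dilution * plated_ml)
    return cfu_per_ml * culture_ml


# Hypothetical example: 150 colonies on the 10^-4 "Pan" plate,
# 0.1 ml plated, 2 ml "Pan" recovery culture.
pan_total = transformants(150, 1e-4, 0.1, 2.0)
print(f"{pan_total:.1e} transformants")
```

Counting the plate that carries a few tens to a few hundred colonies keeps the estimate least sensitive to counting error, which is one reason several dilutions are plated.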

(5) Examination of Individual Clones by ELISA

The binding properties of the peptides encoded by
individual clones are typically examined after 3, 4, or 5
rounds of panning, depending on the enrichment numbers
observed. The most sensitive assay is an ELISA that detects
receptor specific binding by lacI-peptide fusion proteins.
The lacI ELISA can detect binding of peptides that have
monovalent affinities for the receptor as low as ~100 µM.
This sensitivity of the assay is an advantage in that initial
hits of low affinity can be easily identified, but is a
disadvantage in that the signal in the ELISA is not correlated
with the intrinsic affinity of the peptides. Fusion of the
peptides to the maltose binding protein (MBP) as described
below permits testing in an ELISA where signal strength is
better correlated with affinity.

a. Reagents for lysates

• Lysis Buffer (make fresh just before use)
42 ml HE
5 ml 50% glycerol
3 ml 10 mg/ml BSA, protease free, in HE
125 µl 0.1 M PMSF (may include other
protease inhibitors)
750 µl 10 mg/ml lysozyme in HE

• 20% L-arabinose in dH2O, sterile (Important: do not
use D-arabinose)

b. Procedure for the Preparation of lacI ELISA Lysates

1. Inoculate each individual clone in 1 ml LB-Amp,
shake at 37°C, overnight.

2. Dilute 300 µl of the culture into 3 ml LB-Amp,
shake at 37°C for 1 hr.

3. Induce with 33 µl of 20% L-arabinose (0.2%
final), shake at 37°C for 2-3 hrs.

4. Spin at 4,000 rpm, 5 min, Beckman JS 4.2 rotor. 

5. Decant supernatant, keep cells on ice or at 4°C 
for the rest of the procedure. 

6. Vortex to resuspend cells in 3 ml 4°C WTEX
buffer.

7. Spin 4,000 rpm, 5 min, pour off supernatant.

8. Vortex to resuspend cells in 1 ml 4°C TEX
buffer; transfer to 1.5 ml microfuge tubes.

9. Spin 14,000 rpm, 2 min, aspirate supernatant.

10. Resuspend cells in 1 ml lysis buffer, incubate
on ice, 1 hr.

11. Add 110 µl 2 M KCl (final concentration of
0.2 M) to solubilize fusion proteins, invert to mix. Note
that most of the lacI protein will be present as insoluble
inclusion bodies that will be part of the pellet discarded in
step 13. Enough lacI protein is soluble to allow a strong
signal in the ELISA. The KCl helps increase the amount of
soluble lacI.

12. Spin 14,000 rpm, 15 min, 4°C in a microfuge.

13. Transfer ~900 µl of the clear crude lysate to a
new tube. (Store at -70°C if assay is to be done on another
day.)
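
The final concentrations quoted in steps 3 and 11 follow from simple dilution arithmetic. The sketch below checks them; the helper name is hypothetical, and it assumes the induced culture is about 3.3 ml (0.3 ml inoculum plus 3 ml LB-Amp) and that the cell pellet adds negligible volume to the 1 ml of lysis buffer.

```python
def final_concentration(stock_conc, added_vol, start_vol):
    """Concentration after adding added_vol of stock to start_vol.

    Both volumes must use the same unit; the result has the unit of stock_conc.
    """
    return stock_conc * added_vol / (start_vol + added_vol)


# Step 3: 33 µl of 20% L-arabinose into ~3.3 ml of culture (volumes in ml).
arabinose_pct = final_concentration(20.0, 0.033, 3.3)

# Step 11: 110 µl of 2 M KCl into ~1 ml of lysate (volumes in ml).
kcl_molar = final_concentration(2.0, 0.110, 1.0)

print(f"L-arabinose: {arabinose_pct:.2f}%  KCl: {kcl_molar:.2f} M")
```

Both values round to the figures stated in the procedure (0.2% and 0.2 M), so the volumes given are self-consistent.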



c. Reagents for ELISA

• PBT: PBS, 1% BSA, 0.05% Tween-20
• PBS/Tween: PBS, 0.05% Tween-20

• Anti-lacI antibody: Rabbit anti-lacI polyclonal can
be purchased from Stratagene (#217449).

• Goat anti-Rabbit IgG and light chains, alkaline
phosphatase conjugate is from Tago (#6500).

• Alkaline phosphatase substrate is p-nitrophenyl
phosphate.

• Development buffer: 9.5% diethanolamine, 0.24 mM
MgCl2, pH 9.3 with HCl.

d. Procedure for lacI ELISA

1. Coat microtiter wells with the receptor of
interest. Make an equivalent set of minus-receptor control wells
in parallel. Block wells for at least 1 hr with 1% BSA. The
control wells should be as similar as possible to the receptor
coated wells to control for various sorts of nonspecific
binding by the peptides. The assay is usually performed in
duplicate or triplicate wells.

2. Wash plate 4x with 4°C PBS/Tween.

3. Add 100 µl/well crude lysate diluted 1/20 in
PBT; 4°C, 30 min, shake gently.

4. Wash plate 4x with 4°C PBS/Tween.

5. Add 100 µl/well anti-lacI antibody diluted
1/15,000 in PBT; 4°C, 30 min, shake gently. The dilution of
anti-lacI given here is based on our titration of our own
serum. It may be necessary to use a different dilution of the
commercially available serum.

6. Wash plate 4x with 4°C PBS/Tween.

7. Add 100 µl/well goat anti-rabbit alkaline
phosphatase conjugated Ab diluted 1/3,000 in PBT; 4°C, 30 min,
shake gently.

8. Wash plate 4x with 4°C PBS/Tween.

9. Wash plate 2x with 4°C TBS (10 mM Tris pH 7.5,
150 mM NaCl).

10. Develop assay with 200 µl/well of 1 mg/ml alkaline
phosphatase substrate in development buffer.

11. Read plate at A405 in microtiter plate reader.
(Take time point measurements to determine termination time.
Reaction is no longer linear above A405 ~1.0.)

12. Stop reaction with 50 µl/well 2 M NaOH and read
final result.
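
One simple way to summarize the plate readings is to subtract the mean of the minus-receptor control wells from the mean of the receptor-coated wells, and to flag readings above the linear limit noted in step 11. This is a minimal sketch under those assumptions; the function names and the example readings are hypothetical and are not data from this work.

```python
from statistics import mean

LINEAR_LIMIT = 1.0  # A405 above which the reaction is no longer linear (step 11)


def specific_signal(receptor_wells, control_wells):
    """Mean A405 of receptor wells minus mean A405 of minus-receptor control wells."""
    return mean(receptor_wells) - mean(control_wells)


def in_linear_range(wells, limit=LINEAR_LIMIT):
    """True if every reading is at or below the linear limit."""
    return all(a <= limit for a in wells)


# Hypothetical duplicate readings for one clone.
receptor = [0.82, 0.78]
control = [0.11, 0.09]
print(round(specific_signal(receptor, control), 2), in_linear_range(receptor))
```

A clone whose receptor wells exceed the linear limit would be re-read at an earlier time point rather than compared directly with the others.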

e. Transfer of selected sequences to maltose binding protein

Coding sequences of interesting single clones or
populations of clones are often transferred to vectors that
fuse those sequences in frame with the gene encoding MBP.
This is done for several reasons. First, MBP generally exists
in solution as a monomer and the native protein has no
cysteine residues. The monovalency of peptide display allowed
by MBP fusions causes the MBP ELISA described below to be much
more affinity sensitive than the lacI ELISA. Dimer forms
have been reported for MBP purified under certain conditions.
These dimers can be dissociated by the addition of maltose to
the solution. No substantial difference in the MBP ELISA
signal is seen in the presence and absence of 1 mM maltose
using the protocols listed here, so dimer formation under our
conditions appears unlikely.
The second reason for using MBP is that it can be
expressed in very large amounts as a soluble protein which is
easily purified, allowing initial examination of the
properties of peptides without the need for chemical
synthesis. Third, the MBP fusion proteins can be directed to
either the cytoplasm (a reducing environment) or the periplasm
(an oxidizing environment) of E. coli using vectors that
differ only by the presence or absence of an N-terminal signal
sequence in the gene encoding MBP. Some peptides are
expressed more efficiently in one or the other of these two
environments. Fourth, peptide populations linked to MBP can
be easily screened using colony lifts with a selected
receptor.

The cloning of a library into pJS142 creates a BspEI
restriction site near the beginning of the random coding
region of the library. Digestion with BspEI, NheI and ScaI
allows the purification of a ~900 bp DNA fragment that can be
subcloned into one of two vectors, pELM3 (cytoplasmic) or
pELM15 (periplasmic), which are simple modifications of the
pMALc2 and pMALp2 vectors, respectively, available
commercially from New England Biolabs. Digestion of pELM3 and
pELM15 with AgeI and ScaI allows efficient cloning of the
BspEI-ScaI fragment from the pJS142 library. The BspEI and
AgeI ends are compatible for ligation. In addition, correct
ligation of the ScaI sites is essential to recreate a
functional bla (Amp resistance) gene, thus lowering the level
of background clones from undesired ligation events.
Expression of the tac promoter-driven MBP-peptide fusions can
then be induced with IPTG.
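
The compatibility of the BspEI and AgeI ends, and the blunt nature of the ScaI ends, can be seen from the recognition sequences. The sketch below uses the standard published recognition sites and cut positions for these three enzymes; the dictionary layout and function names are illustrative assumptions.

```python
# Recognition site (5'->3', top strand) and top-strand cut position.
ENZYMES = {
    "BspEI": ("TCCGGA", 1),  # T^CCGGA
    "AgeI": ("ACCGGT", 1),   # A^CCGGT
    "ScaI": ("AGTACT", 3),   # AGT^ACT
}


def overhang(site, cut):
    """Return (kind, bases) for the end left by cutting a palindromic site.

    The bottom strand of a palindromic site is cut at the mirror position,
    so the single-stranded overhang is the stretch between the two cuts.
    """
    mirror = len(site) - cut
    if cut == mirror:
        return ("blunt", "")
    kind = "5'" if cut < mirror else "3'"
    lo, hi = sorted((cut, mirror))
    return (kind, site[lo:hi])


def compatible(enzyme_a, enzyme_b):
    """Ends can be ligated if both are blunt or carry the same overhang.

    For palindromic sites the overhang is its own reverse complement,
    so an equality test is sufficient here.
    """
    return overhang(*ENZYMES[enzyme_a]) == overhang(*ENZYMES[enzyme_b])


for name, (site, cut) in ENZYMES.items():
    print(name, overhang(site, cut))
print("BspEI/AgeI compatible:", compatible("BspEI", "AgeI"))
```

Both BspEI and AgeI leave a 5' CCGG overhang, whereas ScaI cuts in the middle of its site and leaves blunt ends, which is why the ligation conditions in the procedure below are chosen with blunt-end ligation in mind.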



f. Procedure for Subcloning into MBP Vectors

1. Digest pELM3 or pELM15 with AgeI and ScaI.
Purify the 5.5 kb fragment away from the 1.0 kb fragment. The
digest is generally run in an agarose gel, and the
appropriate region of the ethidium bromide stained gel excised
under low-intensity long wave UV illumination, and run on a
new gel. Electrophoresis in the second gel yields an
additional purification of the desired fragment and leads to
lower background in the ligation. Elute the DNA from the gel
fragment using a Geneclean kit (Bio 101).

2. Remove a 5-50 ml portion from the 200 ml PAN
amplification culture before harvesting the cells. Allow the
removed portion to grow to saturation overnight. Prepare DNA
from the cells and digest with BspEI and ScaI. Purify the
0.9 kb BspEI-ScaI fragment from the 3.1 and 1.7 kb vector
fragments as described above.
3. Ligate an equimolar mix of the two fragments at
a final DNA concentration of ~50 µg/ml with T4 DNA ligase in
standard ligase buffer containing 0.4 mM ATP (the higher
levels of ATP found in most ligase buffers inhibit efficient
ligation of the ScaI blunt ends). Incubate at 14°C overnight.

4. Inactivate ligase at 65°C for 10 min. To lower
background from religation of the parental vector, digest the
ligation mix with XbaI. Isopropanol-precipitate the ligation
mix using 1 µl of glycogen as carrier, wash carefully with 80%
ethanol, and resuspend the dry pellet in 20 µl dH2O.
Transform ARI 814 with 1 µl, and plate on LB-Amp plates.
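
For the equimolar mix in step 3, the mass of each fragment is proportional to its length, because the mass per mole of double-stranded DNA scales with the number of base pairs. The sketch below splits the total DNA accordingly, using the fragment sizes given in steps 1 and 2; the 20 µl reaction volume and the function name are assumptions made only for illustration.

```python
def equimolar_masses(total_ng, lengths_kb):
    """Split total_ng among fragments so that each is present in equal moles.

    lengths_kb maps a fragment name to its length in kb.
    """
    total_len = sum(lengths_kb.values())
    return {name: total_ng * kb / total_len for name, kb in lengths_kb.items()}


reaction_ul = 20                 # assumed reaction volume (not stated in the text)
total_ng = 50 * reaction_ul      # 50 µg/ml is the same as 50 ng/µl
masses = equimolar_masses(total_ng, {"vector": 5.5, "insert": 0.9})

for name, ng in masses.items():
    print(f"{name}: {ng:.0f} ng")
```

Under these assumptions most of the DNA mass in the ligation is vector, even though the two fragments are present in equal numbers of molecules.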

g. Procedure for MBP ELISA

The cell lysates for the MBP ELISA are prepared by
the same procedure as the lacI ELISA lysates, except that the
induction is done with a final concentration of 0.3 mM IPTG
instead of L-arabinose. The ELISA is performed as described
for lacI above with the following exceptions:

1. Lysates are diluted 1/50 for addition to the
wells.

2. Primary antibody is 1/10,000 diluted polyclonal
rabbit anti-MBP (available from New England Biolabs).
Incubation is for 15 instead of 30 min.

3. The secondary antibody incubation is also for 15
instead of 30 min.

4. Development of the assay generally takes longer
than the lacI ELISA, typically 30-50 min.



Although the foregoing invention has been described
in some detail by way of illustration and example for purposes
of clarity of understanding, it will be apparent that certain
changes and modifications may be practiced within the scope of
the appended claims.

The cell lines described in the application as having been
deposited at the ATCC will be maintained at an authorized
depository and replaced in the event of mutation, nonviability
or destruction for a period of at least five years after the
most recent request for release of a sample was received by
the depository, for a period of at least thirty years after
the date of the deposit, or during the enforceable life of the
related patent, whichever period is longest. All restrictions
on the availability to the public of the cell lines will be
irrevocably removed upon the issuance of a patent from the
above-captioned application.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Schatz, Peter J.
    Cull, Millard G.
    Miller, Jeff F.
    Stemmer, Willem P.C.
    Gates, Christian M.

(ii) TITLE OF INVENTION: Peptide Library and Screening Method

(iii) NUMBER OF SEQUENCES: 162

(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: William M. Smith
    (B) STREET: One Market Plaza, Steuart Tower, Suite 2000
    (C) CITY: San Francisco
    (D) STATE: California
    (E) COUNTRY: USA
    (F) ZIP: 94105

(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER: US 05/545,543
    (B) FILING DATE: 25-OCT-1995
    (C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: US 33/290,541
    (B) FILING DATE: 15-AUG-1994

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: US 27/952,521
    (B) FILING DATE: 15-OCT-1992

(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Smith, William M.
    (B) REGISTRATION NUMBER: 30,225
    (C) REFERENCE/DOCKET NUMBER: 1 5 5 2 5 J-0C 1 2 4 21 :>

(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: 415-325-2400
    (B) TELEFAX: 415-326-2422

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 14 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Gly Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1               5                   10



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 54 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GTGGCCCCNN KNNKNNKNNK NNKNNKNNKN NKNNKNNKNN KNNKTAAGGT CTCG

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 11 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ACCGCC G

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 14 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

ATTCCACAGC TCGA

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Leu Glu Ser Gly Gln Gly Ala Asp Gly Ala
1               5                   10



(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 45 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

^CAOCZ GCCAGGGGGC CCACZZZQCZ TAATTAATTA

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 13 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: dynB 1.3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Tyr Gly Gly Phe Leu Arg Arg Gln Phe Lys Val Val Thr
1               5                   10



(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 21 4 1.2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Thr Gly Lys Arg Gly Phe Lys Val Val Cys Asn
5 10

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 22 4 1.2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Lys Arg Asn Phe Lys Val Val Gly Ser Pro Cys Gly
1               5                   10



(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 28 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 10 4 0.3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Ser Asp Ser Gly Asn Gly Leu Gly Ile Arg Arg Phe Lys Val Ser Ser
1               5                   10                  15

Leu Ala Val Leu Ala Asp Glu Arg Arg Phe Ser Ala
            20                  25

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 30 4 1.5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Gly Thr Arg Pro Phe Lys Val Ser Glu Tyr Ile Leu
1               5                   10

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 28 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 35 4 0.2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Ser Leu Lys Asp Glu Asn Asn Lys Arg Arg Ile Phe Lys Val Ser Ser
1               5                   10                  15

Leu Ala Val Leu Ala Asp Glu Arg Arg Phe Ser Ala
            20                  25



(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 5 7 2 Z . 9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Ser Tyr Leu Arg Arg Gl- Phe Lys Val Ser Gly Val
1               5                   10

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 2 4 4 0.9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Gly Trp Arg Ser Cys Pro Arg Gln Phe Lys Val Thr
1               5                   10

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 45 3 0.9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

Ile Lys Arg Gly Phe Lys Ile Thr Ser Ala Met Ser
1               5                   10



(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 47 3 0.3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

Arg Phe Ile Ala Arg Pro Phe Arg Ile Thr Gly
5 10

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 71 2 1.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Ala Arg Ala Phe Arg Val Thr Arg Ile Ala Gly Val
1               5                   10




(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 74 2 0.2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

Lys Asn Glu Thr Arg Arg Pro Phe Arg Gln Thr Ala
1               5                   10

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 53 2 0-6
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

Val Asn His Arg Arg Phe Ser Val Val His Ser Tyr
1               5                   10



(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 4 3 3 0.4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

Val Ser Ser Ser Arg Thr Phe Asn Val Thr Arg Acc
1               5                   10

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 45 3 0.3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Gly Arg Ser Phe His Val Thr Ser Phe Gly Ser Val
1               5                   10

(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 4 4 1.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

Arg Ser Thr Thr Val Arg Gln His Lys Val Val Gly
1               5                   10



(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 15 4 1.2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

Glu Arg Pro Asn Arg Leu His Lys Val Val His Ala
1               5                   10

(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 73 2 0.5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:

Trp Gln Asn Arg Thr His Lys Val Val Ser Gly Arg
1               5                   10

(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 7 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 7 3 2 1.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

Ala Arg Lys His Lys Val Thr
1               5



(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 11 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 40 3 1.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

Arg Gln Val Thr Arg Leu His Lys Val Ile His
1               5                   10

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 11 4 1.0
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

Cys Pro Gly Glu Arg Met His Lys Ala Val Arg Ala
1               5                   10

(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 2 4 1.0
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

Ser Arg Cys Arg Asn His Arg Val Val Thr Ser Gln
1               5                   10



(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 2 5 4 0.3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

(2) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 9 4 0.3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

Glu Ile Arg Arg His Arg Val Thr Glu Arg Val Asp
1               5                   10

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 56 3 1 . \
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

Leu Arg Arg Leu His Arg Val Thr Asn Thr Met
1 5 10



(2) INFORMATION FOR SEQ ID NO:32:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 69 2 1.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

Val Lys Gln Arg Leu His Ser Val Val Arg Pro Gly
1               5                   10

(2) INFORMATION FOR SEQ ID NO:33:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 7 4 1.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

Val Thr Gln Arg Val Arg Ser Asn Lys Val Val Ser
1               5                   10

(2) INFORMATION FOR SEQ ID NO:34:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 20 4 1-1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:

Val Glu Lys Ile Lys Arg Leu As- Lys V,
1               5                   10



(2) INFORMATION FOR SEQ ID NO:35:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 23 - 1-2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

Arg Leu Lys Thr Arg Leu Asn Lys Val Val Met

(2) INFORMATION FOR SEQ ID NO:36:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 63 2 0.4
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:

Val Arg Met Asn Lys Val Val Cys Glu Lys Leu Trp
1               5                   10

(2) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 11 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 49 3 0.3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:

Asp Leu Lys Arg Leu Asn Arg Val Val Gly His
1               5                   10



(2) INFORMATION FOR SEQ ID NO:38:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 19 4 0.3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:

Arg lis Arg Asn Asn Lys V :ie Ala Arg
1 5 10




(2) INFORMATION FOR SEQ ID NO:39:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 36 4 0.5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:

Ser Arg Val Arg Ser Asn Lys Val Ile Met Ser Ile
1               5                   10

(2) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 7 7 2 0.5
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:

Ser Cys Arg Leu Asn Lys Val Ile Ala Arg Pro Val
1               5                   10



(2) INFORMATION FOR SEQ ID NO:41:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 3 3 4 C . S
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

Arg Ala Leu Ser Lys Asp Arg Leu Asn Lys Val Thr
1               5                   10

(2) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 5B 3 1.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

Cys Thr Thr Glu Arg Ser Arg Gln Trp Lys Val Thr
1               5                   10

(2) INFORMATION FOR SEQ ID NO:43:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 15 4 1.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

Ala Arg Pro Trp Lys Ile Thr Arg Asn CLj ?r: Gly
1               5                   10



(2) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 7 2 2 0.3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:

Gly Val Ser Glu Cys Arg Lys Trp Lys Ile Val Gln
1               5                   10

(2) INFORMATION FOR SEQ ID NO:45:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 6 4 1.2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

Thr Thr Leu Arg Arg Tyr Lys Val Thr Gly Glu Arg
1               5                   10

(2) INFORMATION FOR SEQ ID NO:46:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 3 4 4 l.i
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

Ile Ala Asp Arg Arg Pro Tyr Arg Val Thr Arg Pro
1               5                   10



(2) INFORMATION FOR SEQ ID NO:47:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 7 5 2 1.2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

Ala Gly Lys Val Leu Arg Ala Tyr Lys Ile Val Glu
1               5                   10

(2) INFORMATION FOR SEQ ID NO:48:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 3 4 i.G
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:

Gln Lys Arg Leu Met Lys Val Ile Phe Glu Gly Arg
1               5                   10



(2) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 5 5 2 1.0
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:

Glu Val Pro His Arg Phe Arg Trp Thr Lys His Met
1               5                   10

(2) INFORMATION FOR SEQ ID NO:50:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 24 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 13 4 0.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:

Ser Thr Thr Glu Arg Arg Ser Phe Lys Val Ser Ser Leu Ala Val Leu
1               5                   10                  15

Ala Asp Glu Arg Arg Phe Ser Ala
            20

(2) INFORMATION FOR SEQ ID NO:51:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 23 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 14 4 0.2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:

Arg Leu Pro Gly Arg Met Phe Lys Val Ser Ser Leu Ala Val Leu Ala
1               5                   10                  15

Asp Glu Arg Arg Phe Ser Ala
            20



(2) INFORMATION FOR SEQ ID NO:52:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 23 4 0.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:

Val Gly Ser Phe Lys Arg Thr Phe Lys Val Ser Cys
1               5                   10

(2) INFORMATION FOR SEQ ID NO:53:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 21 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 29 4 0.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:

Arg Gly Arg Met Phe Lys Val Ser Ser Leu Ala Val Leu Ala Asp Glu
1               5                   10                  15

Arg Arg Phe Ser Ala
            20

(2) INFORMATION FOR SEQ ID NO:54:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 29 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
    (B) CLONE: 54 3 0.1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:

Pro Gly Arg Trp Val Arg Gly Val Gly Ile Arg Cys Phe Lys Val Ser
1               5                   10                  15

Ser Leu Ala Val Leu Ala Asp Glu Arg Arg Phe Ser Ala
            20                  25



(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:
(B) CLONE: 60 2 I . :

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

Arg Met Ser Arg Leu Phe Lys Val Ser Ser Leu Ala Val Leu Ala Asp
1 5 10 15

Glu Arg Arg Phe Ser Ala
20



(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:
(B) CLONE: 1 4 C . 1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Pro Asp Val Leu Arg Ala Val Ala Thr Arg Gln His Lys Val Ser Ser
1 5 10 15

Leu Ala Val Leu Ala Asp Glu Arg Arg Phe Ser Ala
20 25

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:
(B) CLONE: 27 4 0.2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Arg Val Arg Gly His Arg Val Val Met Tyr Asn Glu
1 5 10



(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:
(B) CLONE: 54 2 C.I

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Glu Cys Leu His Arg Arg Val His Lys Ile Leu Ser
1 5 10



(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:
(B) CLONE: 51 2 C.l

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Gly Leu Lys Cys Arg Pre Hei Lys Val Asn Ala Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:
(B) CLONE: 50 3 C.i

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Arg His Arg Pro Phe Gly Trp Val Asn Lys Arg Ser
1 5 10



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:
(B) CLONE: 52 3 0.2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Ala Ala Arg Leu Phe Ser Gln I la Arg Arg Phe Pre
1 5 10



(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:
(B) CLONE: 5 3 3 0.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Arg Val Arg Pro His Met Val Thr Gly Asp Lys Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:
(B) CLONE: 31 4 C.l

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Arg Phe Arg Asn Cys Ser Ile Ile Ser Ala Arg Gly
1 5 10



(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:
(B) CLONE: 62 2 C.l

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

e Val Ala His Gln Leu Met
10



(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Gly Ala Asp Gly Ala
1 5



(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

Arg Gln Phe Lys Val Val Thr



(xi) SEQUENCE DESCRIPTI 

Tyr Gly Val Pro Arg II 
1 5

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Gly Lys Arg Xaa



(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 53 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
GC GO OCT AGO TAACTAATGG AG GAT A CAT A AATGAAACCA GTAACGTTAT ACG 53 



(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
CGTTCCGAGC TCACTGCCCG CTCTCGAGTC GGGAAACCTG TCGTGC 45



(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
CCTGCATATG AATTGTGAGC CCTCACAATT CGGTACAGCC CCATCCCACC C 51

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
CGCCATCGAT CAATTCTGAG CGCTCACAAT TCAGGATGTG TGTGATGAAG A 51

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 72 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
TCGAGAGCGG GCAGGGGGCC GACGGCGCCT ACGGTCGTTT CCTGCCTOCT CA 0 TT CAAA G 60
TTCTAACCTA AT 72

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 72 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
CTAGATTAGG TTACAACTTT GA^CTCACCA CGCAGG AAAC CACCGTAGCC CCCGTCGGOC 60
CCCTCCCCGC TC 72

(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
GGGCCTAATT AATTA 15

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
AGCTTAATTA ATTAGCCCCC GT



(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:
CTGGCGCCNN KNNKNNKNNK NNKNNKNNKN NKNNKNNKNN KNNKTAAGCT CTCG
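
As an illustration only, the following Python sketch expands a degenerate NNK codon of the kind that appears to make up the randomized stretch of the oligonucleotide above. It assumes that the degenerate positions read NNK (N = A, C, G or T; K = G or T, per the IUPAC nucleotide codes), that there are twelve such codons (inferred from the 54-base length), and that the standard genetic code applies. The names `expand_degenerate_codon`, `CODON_TABLE` and `dna_variants` are hypothetical and do not come from the specification.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

# IUPAC degenerate symbols used here (assumption: only N and K occur).
IUPAC = {"N": "ACGT", "K": "GT"}


def expand_degenerate_codon(codon):
    """Return every concrete codon matched by a degenerate codon such as NNK."""
    choices = [IUPAC.get(symbol, symbol) for symbol in codon]
    return ["".join(bases) for bases in product(*choices)]


nnk_codons = expand_degenerate_codon("NNK")
encoded = {CODON_TABLE[codon] for codon in nnk_codons}
amino_acids = sorted(encoded - {"*"})
stop_codons = [codon for codon in nnk_codons if CODON_TABLE[codon] == "*"]

# Number of distinct DNA sequences for twelve NNK codons (assumed count).
dna_variants = len(nnk_codons) ** 12

print(len(nnk_codons), "codons encode", len(amino_acids), "amino acids")
print("stop codons within NNK:", stop_codons)
print("DNA variants for 12 NNK codons:", dna_variants)
```

Under these assumptions, restricting the third position to G or T keeps every amino acid available while leaving a single stop codon in the randomized region.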



(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:
GGCGCCACCG T



(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:
AGCTCGAGAC CTTA 14

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
TATTTGCACG GCGTCACACT T 21



(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 47 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:
CCGCGCCTGG GCCCAGGGAA TGTAATTGAG CTCCGCCATC CCCCCZ7 



(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 62 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
CG ACCGCGGA GCTCAATTAC ATTCCCNNKN NKNNKNNKNN KAAACCAGTA ACCTTATACG



(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 59 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
CGATGGCGGA GCTCAATTAC ATTCCCNNKN NKNNKNNKAA ACCAGTAACG TTAT



WO 96/40987 



PCT/US96/09S09 




(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 72 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
CGCCCGCCAA GCTTAGGTTA CAACTTTGAA CTGACGMNNM NNMNNMNNCG GAATGTAATT 60
CAGCTCCGCC AT 72

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:
GAATTCAATT GTGAGCCCTC ACAATTGAAT TC 32

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
ACCACOTCCG G 11 

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:
TTACTTAGTT A 11

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Met Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser
1 5 10 15

Tyr Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala
20 25 30

Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile
35 40 45



(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..84

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

CTC GAG AGC GGG CAG GTG GTG CAT GGG GAG CAG GTG GGT GCT GAG GCC
Leu Glu Ser Gly Gln Val Val His Gly Glu Gln Val Gly Gly Glu Ala
1 5 10 15

TCC GGG CCC GTT AAC GGC CGT CGC CTA GCT CGC CAA TAAGTCGAC
Ser Gly Ala Val Asn Gly Arg Gly Leu Ala Gly Gln
20 25



(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Leu Glu Ser Gly Gln Val Val His Gly Glu Gln Val Gly Gly Glu Ala
1 5 10 15

Ser Gly Ala Val Asn Gly Arg Gly Leu Ala Gly Gln
20 25

(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

Lys Trp Ser Gly Leu Gly Gly Gly Arg Val Leu Val Asn
1 5 10



(2) INFORMATION FOR SEQ ID NO: 151:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:
Arg Arg Trp Ala Thr Ser Gly Pro Arg Gln la- Tyr



(2) INFORMATION FOR SEQ ID NO: 152:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

u o-o Lys Phe Lys Asn Phe Arg Val Val Phe Gln Asn
5 10



(2) INFORMATION FOR SEQ ID NO: 153:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:
Arg Trp Phe Ser Pro Gly Arg Arg Ala Phe Met Val Asn

(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

Gly Arg Pro Phe Arg Gln Asn Ser Pre Val Val Phe
1 5 10



(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

Trp Val Pre Arg Met Gly Arg His Leu Ser Thr Leu
1 5 10



(2) INFORMATION FOR SEQ ID NO: 156:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

Arg Thr Arg His Val Phe Lys Val Ile His Gly
1 5 10



(2) INFORMATION FOR SEQ ID NO: 157:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:
Asn Ala Arg Arg Met Tyr Ser Val Ala Gly Met Asp

(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

Trp Arg Lys Phe Ala Leu Leu Gly Ser GW
1 5 10

(2) INFORMATION FOR SEQ ID NO: 159:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:

His Arg Ala Tyr Arg Ile Ala Thr Met Phe Ser Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 160:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

Arg Gly Leu Met Arg Arg Ser Thr Lys Thr 7a'
1 5 10

(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:
Ala Arg His Arg Met Phe Gln Trp Ala Met Val Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 162:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:

Ile Met Ile Gly Lys Glu Gly Ala Val Ser Ser Ser
1 5 10

WHAT IS CLAIMED IS:

1. A method of isolating a DNA binding protein comprising:

(a) providing a recombinant DNA vector comprising a coding
sequence for a peptide having a specific affinity for a receptor;

(b) inserting a library of oligonucleotides encoding different
potential DNA binding proteins into the vector in-frame with the
peptide coding sequence to form a library of different vectors
encoding different fusion proteins, the fusion proteins differing
in the potential DNA binding protein;

(c) transforming host cells with the vectors;

(d) culturing the transformed host cells under conditions
suitable for expression of the fusion proteins, whereby, if a
fusion protein comprises a potential DNA binding protein with
affinity for the vector, the fusion protein binds to the vector
to form a complex;

(e) lysing the transformed host cells under conditions such that
complexes formed in (d) remain associated;

(f) contacting the complexes with a receptor under conditions
conducive to specific binding of the peptide to the receptor;

(g) isolating complexes bound to the receptor, the complexes
containing vectors encoding DNA binding proteins.

2. The method of claim 1, further comprising isolating the
vectors from the complexes in (g), and repeating (c)-(g).

3. The method of claim 2, further comprising determining the
sequence of a DNA binding protein encoded by a vector in (g).

4. The method of claim 3, further comprising:

transforming the vector in (g) into host cells under conditions
suitable for expression of the fusion protein encoded by the
vector, whereby the fusion protein binds to the vector to form a
complex;

lysing the transformed host cells under conditions such that the
complex remains associated;

contacting separate samples of the complex to the receptor and to
a receptor lacking affinity for the peptide under conditions
conducive to specific binding of the peptide to the receptor;

isolating vector from: (1) complex bound to the receptor and
(2) complex bound to the receptor lacking affinity for the
peptide;

separately transforming vector obtained from (1) and (2) and
calculating an enrichment ratio equal to transformants from (1)
divided by transformants from (2), the enrichment ratio being a
measure of the suitability of the DNA binding protein for
displaying the peptide for specific binding to the receptor.
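
By way of illustration only, the enrichment ratio recited in claim 4 can be sketched in Python as follows. The function name `enrichment_ratio`, its parameter names, and the example colony counts are hypothetical and are not taken from the specification; the sketch assumes that transformants are counted as colonies and that at least one transformant is recovered from the control receptor.

```python
def enrichment_ratio(transformants_receptor, transformants_control):
    """Transformants recovered via the receptor, divided by transformants
    recovered via the receptor lacking affinity for the peptide."""
    if transformants_receptor < 0 or transformants_control < 0:
        raise ValueError("transformant counts must be non-negative")
    if transformants_control == 0:
        # The ratio is undefined without any control transformants.
        raise ValueError("no transformants recovered from the control receptor")
    return transformants_receptor / transformants_control


# Hypothetical colony counts, for illustration only.
ratio = enrichment_ratio(5000, 50)
print(f"enrichment ratio: {ratio:.1f}")
```

A ratio well above 1 would suggest, under these assumptions, that recovery of the vector depends on specific binding of the displayed peptide to the receptor.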

5. The method of claim 2, wherein the potential DNA binding
proteins are variants of a natural DNA binding protein.

6. The method of claim 5, wherein the natural DNA binding
protein is lacI.

7. The method of claim 6, wherein the vector lacks a lacO site.

8. The method of claim 7, wherein the potential DNA binding
proteins are variants of a headpiece dimer comprising two lac
headpieces joined by a linker.

9. The method of claim 2, further comprising contacting the
complexes with bulk DNA to compete with the vectors for binding
to the potential DNA binding proteins.

10. A method of constructing a random peptide library
comprising:

(a) providing a recombinant DNA vector that encodes
a DNA binding protein other than a phage coat protein;

(b) inserting into the coding sequence of the DNA
binding protein a coding sequence for a random peptide such
that the resulting vectors encode fusion proteins, each of
which comprises the DNA binding protein and a random peptide;

(c) transforming host cells with the vectors; and

(d) culturing the transformed host cells under
conditions suitable for expression of the fusion proteins,
wherein the fusion proteins bind via the DNA binding protein
to the vector with sufficient stability that complexes having
a random peptide with a specific affinity for a receptor can
be enriched by affinity purification on the receptor from
complexes lacking a random peptide with a specific affinity
for the receptor.

11. The method of claim 10, wherein the DNA binding
protein is a nonsequence-specific DNA binding protein.

12. A method for screening a random peptide library
for a peptide with specific affinity for a receptor,
comprising:

(a) providing a peptide library wherein each member
is a host cell transformed with a recombinant DNA vector that
encodes a fusion protein comprising a DNA binding protein and
a coding sequence for a random peptide, wherein members differ
from other members with respect to the sequence of the random
peptide, wherein the fusion proteins can bind via the DNA
binding protein to the vector to form complexes having
sufficient stability that complexes having a random peptide
with a specific affinity for a receptor can be enriched by
affinity purification to the receptor from complexes lacking a
random peptide with a specific affinity for the receptor;

(b) lysing the cells transformed with the random
peptide library under conditions such that the fusion protein
remains bound to the vector that encodes the fusion protein;

(c) contacting the fusion proteins of the random
peptide library with a receptor under conditions conducive to
specific peptide-receptor binding; and

(d) isolating the vector that encodes a random
peptide that binds to said receptor.

13. The method of claim 12, wherein the DNA binding
protein has been isolated by the method of claim 1.

14. The method of claim 13, wherein the DNA binding
protein is a nonsequence-specific DNA binding protein.

15. The method of claim 13, wherein the vector lacks
a lacO site.

16. The method of claim 13, wherein the recombinant
vector further comprises a DNA sequence with a specific
affinity for the DNA binding protein.

17. The method of claim 12, wherein the host cells
are bacteria.

18. The method of claim 17, wherein the bacteria are
E. coli, and the vector is a plasmid.

19. The method of claim 13, wherein the DNA binding
protein is a lac repressor protein comprising two lac
headpieces joined by a first linker and the DNA binding
protein is joined to the random peptide by a second linker.

20. The method of claim 19, wherein the first linker
is GRCR, the two lac headpieces are designated A4.5 in Fig. 6
and the second linker is RSQ£.

21. The method of claim 19, wherein the first linker
is GRCR, the two lac headpieces are designated 34.5 in Fig. 6,
and the second linker is G?NQ.

22. The method of claim 12, wherein the random
peptide is located at the carboxy terminus of said fusion
protein.

23. The method of claim 12, wherein the library has
at least 10^6 different members.

24. The method of claim 12 further comprising:
(e) transforming a host cell with the vectors
obtained in (d); and repeating (b), (c), and (d) with the host
cells transformed in (e).

25. A recombinant DNA vector for constructing the
random peptide library of claim 10, said vector comprising:

(a) a DNA sequence encoding the DNA binding protein;

(b) a promoter positioned so as to drive
transcription of the DNA binding protein coding sequence;

(c) a coding sequence for a peptide inserted in the
DNA binding protein coding sequence so that the coding
sequences can be transcribed to produce an RNA transcript that
can be translated to produce a fusion protein capable of
binding to at least one DNA sequence in the vector.

26. A host cell transformed with the vector of
claim 25.

27. A random peptide library comprising at least 10^3
different members, wherein each member is a host cell
transformed with a recombinant DNA vector that encodes a
fusion protein comprising a DNA binding protein other than a
phage coat protein and a random peptide; and wherein members
differ from other members with respect to the sequence of the
random peptide, wherein the fusion proteins can bind via the
DNA binding protein to the vector to form complexes having
sufficient stability that complexes having a random peptide
with a specific affinity for a receptor can be enriched by
affinity purification to the receptor from complexes lacking a
random peptide with a specific affinity for the receptor.